Introduction

The stone crayfish Austropotamobius torrentium (Schrank, 1803) is an endangered keystone freshwater species (Streissl & Hoödl, 2002; Bedjanič, 2004; Füreder et al., 2017; Jussila et al., 2017; Manfrin et al., 2022). This species is native to large parts of central and southeastern Europe, mostly inhabiting streams at medium to high altitudes (Vlach et al., 2009; Kouba et al., 2014). Most of its genetic variability, represented also by nine distinct mtDNA lineages/phylogroups, is restricted to the Balkan Peninsula (Trontelj et al., 2005; Klobučar et al., 2013; Pârvulescu et al., 2019; Lovrenčić et al., 2020). Six of these lineages (Zeleni Vir—ZV, Gorski Kotar—GK, Žumberak, Plitvice and Bjelolasica—ŽPB, Lika and Dalmatia—LD, Kordun—KOR and BAN—Banovina) are found in the north-central Dinarides (NCD), a relatively small area (~ 10.000 km2) that covers western and southern Croatia and the associated border areas in Slovenia and Bosnia and Herzegovina. In the north and northeast (Slovenia and northern and eastern Croatia), these lineages are replaced by the Central and southeastern Europe (CSE) lineage, which is also the most widespread lineage in the European context. However, the actual range of the NCD lineages in the east remains unclear. Available data show that the CSE lineage also occurs in some places in central and eastern Bosnia and Herzegovina and that it is replaced in the central Serbia and southwards by the Southern Balkans (SB) lineage—however, these two countries are clearly underrepresented in phylogeographic studies on A. torrentium (Trontelj et al., 2005; Klobučar et al., 2013; Lovrenčić et al., 2020).

Previous studies have shown that the complex evolution and genetic diversity of A. torrentium were largely shaped by the drainage evolution and orogeny of the Balkans from the Miocene to the Pleistocene (Trontelj et al., 2005; Klobučar et al., 2013; Pârvulescu et al., 2019; Lovrenčić et al., 2020). It is widely accepted that the Dinarides uplift triggered the splitting of the ancestral Austropotamobius into the Austropotamobius pallipes (Lereboullet, 1858) and A. torrentium (Trontelj et al., 2005; Klobučar et al., 2013; Lovrenčić et al., 2020), which is why this vicariance has been proposed as a geological calibration point for molecular dating (Klobučar et al., 2013; Lovrenčić et al., 2020). However, as the Dinarides orogeny was a highly complex process that took place during the Oligocene and Miocene (Čičić et al., 2008; Schmid et al., 2008; Kováč et al., 2016), it is difficult to pinpoint a specific uplift episode that acted as an effective driver of allopatric speciation. Several authors suggest that the vicariance took place in the Middle Miocene (Klobučar et al., 2013; Jelić et al., 2016; Lovrenčić et al., 2020), but recent paleogeological evidence suggests extensive orogeny already during Oligo-Miocene (Balling et al., 2021). In search of a more reliable calibration point, Pârvulescu et al. (2019) suggested that the split between the Apuseni lineage (APU), restricted to the Apuseni Mountains in northwestern Romania, and its sister lineage could serve as a reliable calibration point. They proposed that this lineage, later described as the distinct species Austropotamobius bihariensis Pârvulescu, 2019, was separated from the Dinarides due to a tectonic displacement of the Tisza-Dacia microplate. The Tisza–Dacia microplate, which also includes the Apuseni Mountains, was separated from a larger plate that included the Dinarides at the end of the Burdigalian (~ 16 Ma) and migrated northeastward during the Miocenbihae. This means that the Apuseni Mountains existed throughout the Miocene as a large island cut off from the Dinarides first by the Paratethys and later by the Pannonian Lake (~ 11.7 Ma; Magyar et al., 2013). This vicariant event caused the concestor of APU and its sister lineage to split; because there was a continuous sea or brackish lake between the two areas until the Upper Pliocene (4.5 Ma; Mandic et al., 2015), APU was isolated from the rest of A. torrentium.

Lovrenčić et al. (2020), on the other hand, suggest that the Dinarides and the Apuseni Mountains may have been disconnected since the Triassic, which implies that the colonization of the Apuseni Mountains could not have started before the Early Pliocene. At this time, crayfish may have survived in the shallow parts of the Pannonian Lake due to the strong influx of freshwater from the surrounding rivers. They argue that the proximity of the paleo-Danube and paleo-Tisza deltas discharging into the Pannonian Lake in the Late Miocene may have allowed the ancestor of the present-day APU phylogroup to colonize the Apuseni Mountains by ~ 5.3 Ma. Namely, the paleo-Danube and paleo-Tisza shelf margins merged around 8.6 Ma, forming a unified shelf along the northern perimeter of the deep lake basin, with paleogeographic features revealing a near-perpendicular discharge of the two rivers until 5.3 Ma (Magyar et al., 2013). Moreover, this age coincides with the Messinian Salinity Crisis (MSC; 5.96–5.33 Ma) that may have caused a lowering of the water level of Pannonian Lake, at least in its northern part, further opening a path for possible colonization of the Apuseni Mountains. In addition, the two competing hypotheses provide very different time frames for the split between A. torrentium and A. pallipes and for the evolution of the A. torrentium lineages—according to Pârvulescu et al. (2019), it already took place in the Middle Eocene, while Lovrenčić et al. (2020) propose the Middle Miocene.

Incomplete sampling coverage in the Balkans, east of Croatia, and conflicting estimates of divergence times, leading to completely different time frames for the origin of A. torrentium, make reconstructing the historical biogeography of this species a challenge. In the present study, we extend the phylogeographic coverage of A. torrentium to Serbia, which played a key role in the pre-glacial southward dispersal of the SB and possibly the post-glacial northward expansion of the CSE (Lovrenčić et al., 2020). By applying additional molecular calibration points to the outgroups of the genus Austropotamobius (Bracken-Grissom et al., 2014), we further evaluated which of the two opposing hypotheses for the origin of the APU (Pârvulescu et al., 2019; Lovrenčić et al., 2020) is more likely. If molecular calibration using only outgroups would date the origin of APU to a significantly different age than that proposed for the Tisza-Dacia microplate displacement (~ 16 Ma), the hypothesis of Pârvulescu et al. (2019) should be rejected. Moreover, a much younger origin of the APU would not contradict the hypothesis proposed by Lovrenčić et al. (2020).

Materials and methods

Material acquisition and sampling area

We sampled a total of 135 crayfish at fifteen locations spread throughout Serbia from June 2017 to October 2018 (Table 1; Fig. 1; Table S1). To narrow the gap in the phylogeographic coverage of A. torrentium in the Balkans and to better understand the dispersal of this species, the present study focused on Serbia, a country in the Western Balkans that has so far been strongly underrepresented in phylogeographic studies of this species. The sampling campaign was spread throughout most of the country and focused on representative areas where this species is known or expected. We used standard methods for collecting the material, baited traps or hand searching (Parkyn, 2015; Hilber et al., 2020). We took one pereiopod or antennal tissue from each animal and fixed it in 96% ethanol at site and stored at 4°C until DNA extraction. Sampling followed all legal requirements, including national permits (353-01-102/2017-17, 353-01-142/2018-04 issued by Ministry of Environmental Protection) and established ethical standards.

Table 1 Sample locations details with a summary of COI and 16S rDNA sequencing and inferred mtDNA phylogroups
Fig. 1
figure 1

Map of sampling locations with the geographical distribution of the stone crayfish and its phylogroups and respective haplogroups. Large circles represent new sampling locations selected for this study, while smaller circles indicate the records used from published records (Trontelj et al., 2005; Schubart & Huber, 2006; Klobučar et al., 2013; Petrović et al., 2013; Pârvulescu et al., 2019; Lovrenčić et al., 2020). Figure was prepared in Inkscape v1.1

Molecular analysis

We extracted DNA from the tissue with AccuPrep Genomic DNA Extraction kit (Bioneer Corporation, Daejeon, Korea). The quality and quantity of all DNA extracts were measured by NanoPhotometer N60/N50 (Impplen, GmbH).

We amplified fragments from two mitochondrial marker genes: cytochrome oxidase 1 (COI) and 16S rRNA (16S) by primer pairs: LCO 1490/HCO 2198 and the 16Sar-L/16Sbr-H, respectively. The length of the amplified fragments was 708 bp for COI and about 507 bp for 16S (Folmer et al., 1994; Palumbi et al., 2002). The choice of these mtDNA markers made the produced sequences compatible with those already published; in addition, the 16S marker provided the resolution at the inter-lineage and the COI marker at the intra-lineage level (Trontelj et al., 2005; Schubart & Huber, 2006; Klobučar et al., 2013; Petrusek et al., 2017; Pârvulescu et al., 2019; Lovrenčić et al., 2020; Manfrin et al., 2022).

For both amplified fragments, a reaction volume of 25 µL contained the following: 2.5 mM MgCl2, 0.2 mM of each dNTP, 0.5 µM of each primer, 1U GoTaq HotStart Polymerase (Promega, USA), and about 50 ng of extracted DNA as a template. We set the PCR thermal profile as follows: initial denaturation (3 min at 94°C), followed by 40 cycles (30 s at 94°C, 45 s at 48°C, and 1 min at 72°C or 30 s at 94°C, 30 s at 52°C, and 40 s at 72°C, for COI and 16S, respectively), and a final elongation (5 min at 72°C). Amplicons were visualized by electrophoresis on 1.5% agarose gel and one or both-directional Sanger sequencing (depending on sequence quality) along with purification was provided by Macrogen Inc. (Amsterdam, The Netherlands) using the forward and reverse primers.

We used MEGA11 (Tamura et al., 2021) to manually check the chromatograms for ambiguities and Aliview v. 1.28 (Larsson, 2014) to trim and edit the sequences and to check whether COI sequences translated into amino acids without stop codons. We blast(n)ed (Altschul et al., 1990) all sequences against GenBank (Benson et al., 2012) and used the BOLDSystems v3 identification engine (Ratnasingham & Hebert, 2013) to validate the taxonomic identity of the sequences and to identify new and existing A. torrentium haplotypes.

We obtained original COI sequences from 103 samples and 121 original 16S sequences, of which 89 samples had both sequences (Table S1) and concatenated them. We additionally included ten concatenated sequences (COI + 16S) from Serbia (Klobučar et al., 2013) and 132 unique concatenated sequences recorded outside Serbia (along with their frequencies were needed) that were reported by Lovrenčić et al. (2020) and the studies cited therein (Trontelj et al., 2005; Schubart & Huber, 2006; Klobučar et al., 2013; Petrović et al., 2013; Pârvulescu et al., 2019); all sequences were of sufficient quality (over 95% overlap with our sequences and without ambiguous characters). For outgroups, we used concatenated sequences of A. pallipes (AY667112 for COI; Trontelj et al., 2005; and AY521293 for 16S; Zaccara et al., 2004), Astacus astacus (Linnaeus, 1758; AY667146 for COI; Trontelj et al., 2005; and AF235983 for 16S; Crandall et al., 2000, Pontastacus leptodactylus (Eschscholtz, 1823; AF525228 for COI and KF181959 for 16S; Maguire et al., 2014), Pacifastacus leniusculus (Dana, 1852; AF525226 for COI and AF235985 for 16S; Crandall et al., 2000), and Cambaroides dauricus (Pallas, 1773; AY667147 for COI; Trontelj et al., 2005; and DQ666837 for 16S; Ahn et al., 2006). We aligned the final dataset with MAFFT v.7 (Katoh et al., 2019) trimmed primers and removed poorly sequenced flanking regions to match the sequences from earlier studies (Trontelj et al., 2005; Klobučar et al., 2013; Lovrenčić et al., 2020). The A. torrentium lineage nomenclature is adopted from Lovrenčić et al. (2020).

Phylogenetic analysis

We used jModelTest2 (Guindon & Gascuel, 2003; Darriba et al., 2012) to select the nucleotide substitution model selection under AICc (Akaike, 1974), and MrBayes v. 3.2.6 (Ronquist et al., 2012) and RAxML v 8.2.12 (Stamatakis, 2014) for phylogeny reconstruction (Bayesian inference—BI, and maximum likelihood—ML, respectively). We have reconstructed phylogenies for both markers and for the concatenated dataset. For BI, we used the HKY model with gamma distribution and invariant sites (Hasegawa et al., 1985; Yang, 1994; Steel et al., 2000) to reconstruct an all-terminal phylogeny from the complete set of the A. torrentium sequences (original and published) and outgroups. We simultaneously ran two independent runs of four Markov chains (Mau & Newton, 1997), for 20 million generations, with a sampling frequency of 2000. The starting tree was random. Before combining the trees into a majority rule consensus tree with posterior probabilities (PP), we verified adequate sampling (ESS >  > 500) and convergence with Tracer (Rambaut et al., 2018) and applied a 25% burn-in.

To account for signal saturation in ML bootstrap analysis, we selected two to six sequences per lineage/haplogroup and all outgroup sequences to create a subsample of the total sequence selection (Table S1)—for the concatenated dataset, the analyses had 56 terminals including outgroups. We constructed the phylogenetic tree under the GTRGAMMAX model and assessed the phylogenetic confidence via bootstrap support (BS) with 1000 replicates (Felsenstein, 1985) using the GTRCATX model (Stamatakis, 2014). Alternative topologies tree were investigated by the method of split decomposition (Bandelt & Dress, 1992) as implemented in SplitsTreeCE (Huson & Bryant, 2006) using Neighbor-Net distance transformation (Bryant & Moulton, 2003) and equal angle splits transformation (Dress & Huson, 2004). The concatenated dataset included all A. torrentium haplotypes and A. pallipes as outgroup.

Haplotype network analysis

We assessed population structure and estimated genetic differentiation between A. torrentium populations focusing on those from Serbia and between lineages present therein. To investigate the genealogy between lineages and their haplotypes, we computed a haplotype network using the Median-Joining method (MJ; Bandelt et al., 1999) as implemented in PopART v. 1.7 (Leigh & Bryant, 2015) and used the simple indel coding method (Simmons & Ochoterena, 2000) as implemented in SeqState (Müller, 2005) for gap coding.

Molecular dating analyses

We used BEAST 2 (Bouckaert et al., 2014) for a time-calibrated phylogeny on the same subsample as was used for the ML reconstruction (Table S1). For the substitution model selection, we used a bModelTest expansion in BEAST 2 (Bouckaert & Drummond, 2017), setting the analysis as suggested by Barido-Sottani et al. (2018). As A. torrentium clades are deeply divergent, we used a birth–death process (Gernhard, 2008) for the tree model and the optimized relaxed clock model (Douglas et al., 2021). All clades for which BI and ML analysis revealed high support (PP > 95% and BS > 75%) were treated as a priori monophyletic. We have calibrated the chronogram by applying time constrain priors with normal distribution on four nodes (Table 2). For ages of the first three outgroup splits (calibration of outgroups), we used constrains estimated by Bracken-Grissom et al. (2014) based on the Bayesian Posterior Branching Process (BPBP) fossil calibration model (Wilkinson et al., 2011)—Astacus + Austropotamobius (43.0 ± 13.2 Ma), Pacifastacus + Astacus + Austropotamobius (66.0 ± 13.6 Ma) and Cambaroides + Pacifastacus + Astacus + Austropotamobius (99.0 ± 16.0 Ma); these secondary calibrations were estimated within a comprehensive reconstruction of phylogenetic relationships among major groups of lobster-like decapods using 28 fossil records, among which were also fossils used to calibrate our outgroups. To constrain the age of the ingroup (A. torrentium), we evaluated two contrasting geological calibrations for the split between the APU and its sister clade. Pârvulescu et al. (2019) dated the origin of APU according to the displacement of the Tisza-Dacia microplate (16 ± 1 Ma), while Lovrenčić et al., (2020) presumed that strong freshwater influxes from the merged paleo-Danube and paleo-Tisza delta might have allowed crayfish to survive in northern shallow parts of the Pannonian Lake and even colonize the Apuseni Mountains ~ 5.3 Ma (Table 2). In all our phylogenies, the most basal split within A. torrentium is the one separating APU from the rest, thus the concestor of APU and its sister clade is also the concestor of all the A. torrentium. This topology was supported with relatively high PP values in the BI analysis, while the support in the ML tree was just under statistical significance according to the BS analysis (see Results). To evaluate the suitability of the two geological calibrations, we compared three chronograms—one was calibrated only by calibration of outgroups, while the other two were calibrated using two different sets of geological calibrations. After comparing the time estimates for the A. torrentium concestors, we inferred the final chronogram by calibrating it both using outgroup calibrations and the geological calibration that fitted better (outgroups + geological calibration). We ran all analysis in four independent runs for 100 million generations sampling every 10,000th generation and used the BEAGLE library for parallel computing (Ayres et al., 2012). After verifying MCMC convergence and adequate sampling in Tracer, we combined the independent runs and summarized trees with a 10% burn-in to calculate the maximum clade credibility tree, common ancestor heights, posterior probabilities, and 95% highest posterior density (HPD) intervals.

Table 2 Fossil and geological calibrations used in divergence time estimation

Results

Sequence variability

We recovered a total of 103 sequences for COI and 121 for 16S and confirmed that all individuals were indeed Austropotamobius torrentium. We identified 25 COI (24 newly reported) and 13 16S (eight newly reported) haplotypes and expanded the number of known A. torrentium haplotypes for Serbia from 6 to 30 for COI and from 4 to 16 for 16S. We deposited all new haplotypes to the GenBank (COI: PP391059–PP391082; 16S: PP391134–PP391141; Table S1).

We built the final concatenated matrix using those individuals for which both loci were available, 89 in total (29 unique concatenated sequences). We further expanded the concatenated matrix by adding 132 unique sequences sourced from published records (Trontelj et al., 2005; Schubart & Huber, 2006; Klobučar et al., 2013; Petrović et al., 2013; Pârvulescu et al., 2019; Lovrenčić et al., 2020). After trimming the reads the length of the aligned concatenated matrix was 1,057 bp (1,050 bp without outgroups), of which 582 bp belonged to COI and 475 bp to 16S.

Phylogenetic analysis

ML and BI resulted in mostly congruent topologies for COI, 16S, and the concatenated dataset (Figs. 2, S1, S2); obvious misassignments in the GenBank leading to discrepancies between COI and 16S sequences were corrected if possible or removed (see Table S1 for the final list of sequences). All newly obtained sequences were classified into eight previously reported mtDNA lineages. Our extended mtDNA phylogeny largely confirms the distinct lineages identified in previous studies (Trontelj et al., 2005; Klobučar et al., 2013; Pârvulescu et al., 2019; Lovrenčić et al., 2020) —NCD lineages, APU and the CSE formed well-supported monophyletic clades, while the monophyly of the SB was not supported.

Fig. 2
figure 2

Phylogenetic tree of the stone crayfish inferred by Bayesian inference (BI). Branch lengths represent substitutions, and the scale bar indicates the number of substitutions per site. The symbols on the nodes correspond to BI posterior probabilities (PP) and/or bootstrap support values (BS) when also observed by the maximum likelihood (ML) approach. The ML tree was reconstructed only from a subset of data (Fig. S2). Haplotype names are provided only for those that were detected in the present study. Phylogroups and respective haplogroups that were detected within Serbia are squared. Figure was prepared with Figtree v1.4.4. and edited in Inkscape v1.1

Regarding the relationship between the identified lineages, we demonstrate that APU is likely a sister group to all other A. torrentium lineages. Basal branching of APU was supported by the relatively high PP values of the BI analysis (0.94%), but not in the ML tree, where this topology is suggested but not statistically supported by the BS analysis as its value was 47. Further monophyly support was observed for CSE + SB and CSE + SB + BAN according to both BI and ML, and for GK + ZV according only to the BI tree. Increased sampling also led to some improvement in phylogenetic resolution within SB and CSE. With regard to SB populations from Serbia, we have identified a new well-supported subclade, here named SB-Kolubara (SB-K); demonstrated strong support for the monophyly of the most abundant subclade in Serbia (71 and 1.00 for BS and PP, respectively), here called SB-Serbia (SB-S); and expended the number of haplotypes present in the subclade from eastern Serbia, here called SB-Eastern Serbia (SB-ES), which is a sister group to the SB-KM according to BI (Kefalari-Maras subclade sensu Trontelj et al., 2005). Along the three Serbian subclades and the SB-KM, we have, in the south, identified five additional subclades—SB-RC (SB-Rijeka Crnojevića), SB-V1 (SB-Vardar 1), SB-V2 (SB-Vardar 2), SB-ST1 (SB-Struma 1), and SB-ST2 (SB-Struma 2)—and one individual haplotype represented (1_13) in a basal polytomy with the CSE. Within the CSE populations from Serbia, we expanded the number of known haplotypes belonging to the CSE. Among these are three new haplotypes from southeastern Serbia that, together with three haplotypes from eastern Herzegovina (Bosnia and Herzegovina), form a basal subclade within the CSE, here referred to as CSE-Dinarides (CSE-DI). CSE-DI is characterized by high BS support (74%) and moderate PP support (0.86); also, the two sister clades within this subclade are both well supported. According to the BI, the basal branch to the CSE-DI connects it to the haplotype 64, detected in Turkish Thrace; however, this topology is not statistically supported. While further divisions within CSE are not statistically supported, they point to grouping of haplotypes into four additional subclades that match clusters observed in the MJ network analysis (CSE-S1, CSE-S2, CSE-S3, CSE-DA, see below).

The split decomposition approach revealed a similar pattern with nine lineages that were grouped in three clusters (Fig. 3). Splits on one end of the central trunk of the network lead to the APU and NCD phylogroups, except BAN,while those on the opposite end lead to the SB and CSE, with BAN placed in between the two ends, but closer to the first one. The root is placed at the base of the APU split. In general, branches leading to APU and NCD lineages were comparably long, while branches leading to SB and CSE were shorter; longer length of haplotypes for which only COI sequences were available is an artifact. Haplotypes within SB had shorter branches but formed distinct haplogroups matching those in the phylogenetic trees, including SB-S, SB-ES, and SB-K; some of these subgroups are less apparent at first look due to “artificially” long lineages of haplotypes where only a COI sequence was available. Haplotypes within CSE had even shorter branches than those within SB and are mostly not clustered into haplogroups, except for the cluster merging CSE-DI and CSE-S3.

Fig. 3
figure 3

Split decomposition graph of the Neighbor-Net phylogenetic network analysis of the stone crayfish. The colors represent different phylogroups and their respective haplogroups, while the scale bar represents the nucleotide substitutions per site. Figure was prepared in SplitsTreeCE and edited in Inkscape v1.1

Genetic diversity and phylogeography

MJ networks revealed mostly congurent topologies between the COI, 16S and the concatenated dataset also matching the general structure observed in the phylogenetic trees and network from previous studies. Haplotypes clustered into nine lineages separated by a large number of mutational steps, while the number of mutational steps between neighboring haplotypes within lineages was small in most cases. In Serbia, we observe a regional population structure of A. torrentium with high levels of genetic diversity among populations as well as high levels of genetic divergence between groups of populations/regions (Figs. 1, 4, S3). Of nine known A. torrentium lineages, at least three are present in Serbia—LD, SB, and CSE. The most northern samples of A. torrentium in Serbia are from Fruška Gora in Vojvodina (1—Rakovački potok) and belong to LD, which is otherwise restricted to Lika, Dalmatia (Croatia) and Western Bosnia and Herzegovina. Within this lineage, we detected two new COI haplotypes, raising the number of known COI haplotypes to 11; the 16S sequences belong to the only known haplotype (17 16S). On the network, this lineage spreads mostly along a single axis, with the newly detected haplotypes positioned terminally (Figs. 4, S3). The new LD haplotypes from Fruška Gora are separated by three/four mutational steps from the remaining LD haplotypes. The LD is together with other NCD lineages restricted to the western Balkans, while the SB and CSE have a much wider distribution. In Serbia, the SB has the broadest distribution—its range spreads through the western, central, southern, and southeastern part of the country. Concatenated dataset reveals the existence of five haplogroups within SB and two divergent haplotypes (Fig. S3); three additional haplogroups emerge when looking only at COI sequences, while one of the divergent haplotypes (81_39) from the concatenated dataset clusters with an additional COI haplotype in this network (16S sequences were not available for these samples); altogether, we have identified nine haplogroups and one divergent haplotype within the SB (Fig. 4). Three of these haplogroups are (exclusively) present in Serbia; SB-ES is restricted to eastern Serbia, between Južna Morava and Stara Planina, and was found in the upper part of the Timok drainage (5—Župska reka), Južna Morava drainage (6—Jelovička reka), as well as the Struma drainage of the Aegean basin (15—Brankovačka reka). SB-S, the most widespread haplogroup, dominates central, western, and southern parts of the country, west of Velika, and Južna Morava and was found in Velika Morava (11—Grošnica), Zapadna Morava (7—Užički potok, 8—Crna Kamenica, 9—Čalački potok, 10—Rasina), Upper/Middle Drina (14—Jarevac), as well as Kolubara drainage (12—Crna reka). In the Crna reka locality, this haplogroup is syntopic with the SB-K haplogroup, which we detected only at this location, where it was represented by two distinct COI and two distinct 16S haplotypes combined with three unique concatenated sequences (COI + 16S). Of the six haplogroups recorded outside Serbia, two (SB-V1, SB-V2) are mostly restricted to Vardar River watershed in North Macedonia, two to the Struma River watershed in Bulgaria and Greece (SB-ST1, SB-ST2), one to the Drama region in Greece (ST-KM), one was detected only in a single location in the short Rijeka Crnojevića river in Montenegro (SB-RC), while the divergent haplotype (1_13) comes from the Mesta River in Bulgaria. Lastly, in the CSE lineage, internal substructures are less apparent. Here, the network analysis revealed only five haplogroups separated by a couple mutation steps between them. In Serbia, CSE populations have a disjunct distribution, restricted to southwest and northeast of the country with SB populations in between. In southeastern Serbia, the CSE lineage from the Lim River System (13—Uvac), which belongs to the uppermost part of the Drina watershed, hosts haplotypes that together with those found in eastern Herzegovina, west of the Drina River (Tjenetište, Trnovo; Klobučar et al., 2013) from the CSE-DI haplogroup. The CSE-DI occupies an intermediate position with respect to the remaining CSE haplotypes and the SB haplotypes in the MJ network. Northeastern Serbia, on the other hand, hosts at least two CSE haplogroups—a tributary to the Timok River (4—Rukjavica) hosts the CSE-Danube (CSE-DA) haplogroup, while tributaries to the Danube in the Đerdap Gorge (2—Pesača and 3—Crna reka) host the CSE-Sava 1 (CSE-S1) haplogroup. The CSE-DA is the most abundant CSE haplogroup, and is present in Romania, Hungary, Czechia, Germany, Switzerland, France, and Luxembourg and present also in Croatia (Papuk Mountains) and Austria (outside the Drava and probably the Mura River systems), while the CSE-S1 haplogroup has been rarely detected, besides the Đerdap Gorge, it is known only from a couple tributaries in the Sava River System around Croatia—Slovenia border and in two streams in the uppermost part of the Sava system. The remaining CSE haplogroups, CSE-S2 and CSE-S3, are mostly restricted to the Sava River System above Zagreb. There CSE-S2 is the predominant haplogroup in Slovenia, while CSE-S3 has been detected in a couple streams there and also in the Upper Vrbas System, where it is represented by the most basal haplotype of this haplogroup (88_4).

Fig. 4
figure 4

Median-joining haplotype network of COI sequences. Haplotypes abundances are represented by different sizes of colored circles, with individual colors corresponding to the different phylogroups. The haplotypes found in the present study are in bold. Phylogroups and respective haplogroups that were detected within Serbia are squared. Mutations are given in brackets (red color) besides the lines connecting the haplotypes. Missing or theoretical haplotypes are shown as black dots. Figure was prepared in PopART and edited in Inkscape v1.1

Time-calibrated molecular phylogeny of A. torrentium

The divergence time estimations analysis based on the A. torrentium haplotypes and outgroups resulted in chronograms with similar topology to phylogenetic tree reconstructions. Using only outgroup calibrations, we estimated the time to the most recent common ancestor (TMRCA) of A. torrentium to 15.1 Ma (95% HPD: 7.6–23.5; Table S2). This is largely overlapping with TMRCA estimation (16.0 Ma; 95% HPD: 14.8–17.1) as evaluated from Tisza-Dacia microplate displacement for the split between APU and its sister clade as suggested by Pârvulescu et al. (2019; Table S2), while we observed no overlap with the confidence intervals for (5.2 Ma; 95% HPD: 4.2–6.2) when calibration was done using the paleoecological conditions in the Pannonian Lake as suggested by Lovrenčić et al. (2020; Table S1). According to the outgroups + geological calibration, the split of APU from the remaining A. torrentium dated, to the end of Burdigalian at 16 Ma (95% HPD: 14.8–17.1; Fig. 5; Table S1), was followed by the separation of the ancestral ŽPB + KOR dated to Serravallian at 13.7 Ma (95% HPD: 10.2–16.9) from the rest of A. torrentium. Further splits leading to separation of the ancestral GK + ZV lineage from the rest of A. torrentium dated to 12.8 Ma (95% HPD: 9.0–16.7) and to formation of the LD dated at 11.5 Ma (95% HPD: 7.9–15.6) were not clearly delineated. In the Tortonian, the splits of the GK + ZV and of the ŽPB + KOR occurred around 9.5 Ma (95% HPD: 4.6–16.3 and 4.2–16.0, respectively); however, the monophyly of the second split was not clearly supported. This was also the geological stage when BAN, the youngest of the NCD lineages, was separated from the SB + CSE ancestral lineage, dated at 8.8 Ma (95% HPD: 5.6–12.1). The most recent lineages split in the Messinian at 6.2 Ma (95% HPD: 3.8–8.6), and this was followed by rapid diversification within SB during Early Pliocene (~ 5 to 3.5 Ma), while the separation of CSE-DI from the remaining CSE occurred in Late Pliocene (3.3 Ma; 95% HPD: 1.7–5.2), followed by subsequent splits in Pleistocene.

Fig. 5
figure 5

Fossil and geologically calibrated phylogeny of the stone crayfish as reconstructed from a subset of data created with an optimized relaxed clock model BEAST 2. 95% HPD intervals are shown as value bars at the nodes. Calibration points are indicated by arrows. Median node ages are shown as node labels. Time estimates are given in millions of years. Phylogroups and respective haplogroups that were detected within Serbia are squared (PLIOC—Pliocene, PLEIS—Pleistocene, Burd—Burdigalian, Lang—Langhian, Serrav—Serravallian, Mes—Messinian, Zan—Zanclean, Pi—Piacenzian, G—Gelasian, Ca—Calabrian, C—Chibanian). Figure was prepared with Figtree v1.4.4. and edited in Inkscape v1.1

Discussion

Taxonomical implications from the improved mtDNA phylogeny

Our molecular phylogeny confirms the same distinct mtDNA lineages identified in previous studies (Trontelj et al., 2005; Klobučar et al., 2013; Pârvulescu et al., 2019; Lovrenčić et al., 2020), but with one key difference. While Pârvulescu et al. (2019) and Lovrenčić et al. (2020) suggested that GK was the first lineage to split within A. torrentium, our phylogenies suggest that APU was the first to diverge, with GK and ZV-forming sister clades (Figs. 2, S1, S2)—close relation between the laterx two is also hinted by the microsatellite data (Lovrenčić et al., 2022). Our ‘APU first’ topology was statistically supported by PP values in the BI tree (0.94), whereas bootstrap support in the ML tree was just below 50 (47). In contrast, the ‘GK first’ topology from previous studies (Pârvulescu et al., 2019; Lovrenčić et al., 2020) had high BI values (≥ 0.95) but no statistical bootstrap support. The reason for this discrepancy is not clear, but taxon sampling is of key importance for phylogeneti¸x¸xc inference (Rosenberg & Kumar, 2003), and increasing sampling has been shown to help resolve problematic phylogenies across the animal kingdom (e.g., Kuntner et al., 2023; Robbemont et al., 2023). On the other hand, the differences in basal topology could also be due to the use of different outgroups. Pârvulescu et al. (2019) rooted the tree with A. astacus, P. leptodactylus, and A. pallipes complex, and Lovrenčić et al. (2020) rooted the tree only with A. pallipes. Traditionally, applying multiple outgroups is common in phylogenetic theory (Nixon & Carpenter, 1993; Cheng & Kuntner, 2014; Marić et al., 2017; Zou et al., 2020), but some recent research suggests that distant outgroups may in some cases introduce noise into phylogenetic analysis leading to random rooting (Wheeler, 1990; Grant, 2019; DeSalle et al., 2023). However, in our study, raw genetic distances between outgroups and ingroups were relatively low making random rooting unlikely. The highest values in our study were well below 0.18 (results not shown), whereas DeSalle et al. (2023) demonstrated that phylogenetic accuracy is only compromised when these distances exceed 0.4. In our case, the use of multiple outgroups was necessary because they were essential for the calibration of the molecular clock. Finally, the ‘APU first’ topology is also favored by the BI of ITS2 gene in Lovrenčić et al. (2020) but it is not statistically supported. Expanding the dataset to include additional nuclear (and mitochondrial) markers, or ideally using a genome-wide dataset (e.g., RAD-seq, UCE) that could also detect putative hybridization between lineages, should resolve the relationship at the base of the A. torrentium tree and provide a definitive phylogeny of this species.

Based on highly divergent molecular data and differentiating morphological characters, Pârvulescu (2019) proposed that the crayfish belonging to the APU, endemic to the Apuseni Mountains in Romania, should be treated as a separate species, Austropotamobius bihariensis. However, as the ‘GK first’ topology (Pârvulescu, 2019; Pârvulescu et al., 2019; Lovrenčić et al., 2020) suggests that APU and ZV are sister groups that are well nested within the NCD clade, a taxonomic revision of Pârvulescu (2019) implied that post-revision A. torrentium is paraphyletic. Our newly proposed ‘APU first’ phylogeny, on the other hand, resolves this taxonomic inconsistency as it suggests that APU is basal with respect to all lineages, making A. bihariensis and A. torrentium sister species and recovering monophyly of the latter.

Time-calibrated molecular phylogeny of Austropotamobius torrentium

Molecular dating applied to the improved molecular phylogeny of A. torrentium revealed a large overlap in estimates of divergence times when calibrated across outgroups only (Bracken-Grissom et al., 2014), and when only a geological calibration was used, as proposed by Pârvulescu et al. (2019). According to the calibration of outgroups, the TMRCA of A. torrentium was dated to 15.1 Ma (95% HPD: 7.6–23.5), which is broadly consistent with the confidence interval for the geological age estimate of the Tisza-Dacia microplate displacement (16 ± 1 Ma; Pârvulescu et al., 2019), which has been proposed as a time constraint for dating the concestors of the APU and its sister clade and is also the concestor of all A. torrentium according to our improved phylogeny. Moreover, the same estimates of TMRCA of A. torrentium differ significantly from those proposed by Lovrenčić et al. (2020) for dating the concestor of APU (5.3; 95% HPD: 4.5–6.1). This means that the Tisza-Dacia microplate displacement (~ 16 Ma) may indeed have carried part of the ancestral A. torrentium population to the current location in the Apuseni Mountains, where it remained geographically isolated from the Dinarides through marine and lacustrine phases of the Pannonian Basin (Pârvulescu et al., 2019), giving rise to the APU over time.

Further splitting and formation of the present-day NCD lineages followed relatively quickly; while the topology of most splits leading to NCD lineages has low PP support, our BEAST 2 analysis dated the splits of the ancestral ŽPB + KOR and GK + ZV and the LD lineages to Middle Miocene (Fig. 5). These splits might be associated with the disintegration of the Dinaride Lake System (DLS), a widespread system of Neogene long-living freshwater lakes. The DLS, which, during its maximum extent late Early Miocene (~ 16 Ma), covered large parts of Croatia, Bosnia, and Herzegovina, and parts of northern Montenegro and southwestern Serbia (Mandic et al., 2012), disintegrated in the Middle Miocene, some 13 Ma (de Leeuw et al., 2012). Alternatively, the formation of NCD lineages might have resulted from tectonic activity or other events like the Badenian Salinity Crisis (BSC) that occurred in 13.8–13.4 My, which caused the Central Paratethys to become hypersaline and impacted climate and sea levels (Peryt, 2006; de Leeuw et al., 2010; Palcu et al., 2017). BSC-associated changes affected speciation of regional freshwater invertebrates (Mamos et al., 2016) and vertebrates (Recknagel et al., 2023).

Current distribution of A. torrentium and its lineages reflects the restricted distribution of the NCD populations throughout the Miocene, bounded by a sea or a brackish lake in the Pannonian basin to the east, the Alps to the north, and the External Dinarides to the west. Limited availability of material from Bosnia and Herzegovina (BIH), where the species is also known to be present in the watersheds of Una, Vrbas, and the Bosna river (Trožić-Borovac, 2011; Roljić et al., 2022), complicates the understanding of the south-eastward spread. According to available data, BIH hosts at least CSE and LD haplotypes. Klobučar et al. (2013) detected LD haplotypes near the border with Croatia (Klobučar et al., 2013); however, the presence of this lineage in Fruška Gora (Serbia), northeast to BIH (Fig. 1), indicates a possible wide distribution in this country. Similarly, the disjunct distribution of BAN, which appears restricted to Banija and southern Dalmatia, also hints toward a possible corridor through west BIH. Other samples from BIH belong to the CSE. Samples from southeastern BIH, i.e., Herzegovina (Klobučar et al., 2013) and samples from southwestern Serbia formed a newly identified basal subclade CSE-DI, while the samples from the Central BIH belonged to the prevalent subclade, in which they formed a basal polytomy with two further subclades. Lovrenčić et al. (2020) suggested that the south-eastward spread of A. torrentium into Serbia and the southern Balkans occurred in the Pliocene after the final formation of the Danube River basin and its drainage; however, the presence of the basal CSE-DI sublineage in BIH and Serbia indicates an alternative route for the south-eastward spread of A. torrentium, which could have involved the Late Miocene remnants of the DLS. Lakes Sarajevo–Zenica and Livno–Duvno remained active as a swamp environment/perennial lake draining the surrounding mountain ranges for some time during the post-Middle Miocene (de Leeuw et al., 2011; Mandic et al., 2016), a period roughly coinciding with the estimated split between BAN and CSE + SB (Fig. 5). According to this hypothesis, disintegration of the DLS in late Early Miocene favored allopatric divergence of NCD clades. Further spread of A. torrentium through Serbia and the formation of the SB lineage might have than been associated with the eastern counterpart of the DLS, the Serbian Lake System (SLS). In the broadest sense, this Neogen lake system stretched from Belgrade over the Skopje and Sofia basins to northern Greece (Sant et al., 2018; Neubauer et al., 2020), while the core of the lake was constricted to the western and central Serbia, roughly matching the Zapadna and Velika Morava drainages (Sant et al., 2018). While this system is slightly younger than the DLS (Neubauer et al., 2020), its northern part disappeared before the Late Miocene; however, it “successor” to the south, lakes Kosovo, Metohija, Stanitsi, Katerini, and Skopje remained active in the Late Miocene (Neubauer et al., 2015), and some even persisted through the Pilocene (Metohija, Ptolomais, Strimon; Gemignani et al., 2022; Neubauer et al., 2015). Association of SB clade with the younger SLS is hinted by a more relaxed topology, by smaller molecular distances in comparison with NCD clades and a more relaxed spatial distribution. Alternatively, spread of A. torrentium from the remnants of DLS into the remnants of the SLS could have also take place along the Lake Slavonia, the freshwater successors to the Pannonian Lake that arose in the Early Pliocene (~ 4.5 Ma; Mandic et al., 2015). The final split between CSE and SB and continued fragmentation of the SB, estimated to take place during the latest Miocene and in the Pliocene, along with southward advancement and crossing into the Aegean watershed, were likely shaped by the final disintegration of the SLS in southern Serbia and the complex hydrogeological history of this area. Namely, a presumed Pliocene Morava–Vardar River connection serves as an explanation for the distribution of many “Danubian bound” rheophilic fish (Economidis & Banarescu, 1991; Bănărescu, 2004; Veličković et al., 2023). Moreover, our results agree with a gradual spread of CSE upwards and downwards along the Sava drainage during the Plio-Pleistocen followed by a rapid northward post-glacial (re)colonization of central Europe through the Danube drainage (Klobučar et al., 2013).

In a broader context of crayfish evolution, our improved chronogram dates the split between A. torrentium and A. pallipes to the Late Oligocene (27.9 Ma 95% HPD: 18.7–38.6), which is later than the Middle Eocene (~ 42 Ma) suggested by Pârvulescu et al. (2019), but earlier than the Middle or Late Miocene as suggested by other authors (Klobučar et al., 2013; Jelić et al., 2016; Lovrenčić et al., 2020), who linked the split between A. torrentium and A. pallipes to the Dinarides orogeny, an event that divided the Adriatic-Black Sea drainage. However, our understanding of Dinaric orogeny is limited (Schmid et al., 2008)—it probably began in the Late Eocene/Early Oligocene, and it continued well into the Neogene (Čičić et al., 2008; Schmid et al., 2008). Recent paleogeological evidence from the study of marine terrace formation along the Dinaric coast suggests that extensive and broad uplift of the External Dinarides occurred already during Oligo-Miocene (Balling et al., 2021), a time estimate that is also consistent with radiometric age dating of volcanic and mafic extrusive rocks in Internal Dinarides (22–37 Ma), which serves as a proxy for mantle delamination (Pamić et al., 2000; Schefer et al., 2011). Both estimates also overlap with our molecular dating for the separation of A. torrentium and A. pallipes (27.9 Ma 95% HPD: 18.6–38.6). Alternatively, if the origin of APU were constrained to the Early Pliocene, as suggested by Lovrenčić et al. (2020), this would imply, according to the ‘APU first’ phylogeny, that the separation between the A. torrentium and A. pallipes occurred about 9.4 Ma (95% HPD: 5.6–14.3) and can hardly be associated with extensive uplift of the Dinarides according to the newest estimates. On the other hand, our dating of A. torrentiumA. pallipes split to (Late) Oligocene, could also link this split to the Slovenian corridor, an episodically formed marine passage during the Oligocene (Popov et al., 2004a, 2004b) but also Miocene (Mandic et al., 2015; Ivančič et al., 2018) that connected the Mediterranean and the Central Paratethys.

New insights into the phylogeography of A. torrentium in the Balkans

The complex hydrogeographic history of the western Balkans (DLS, SLS, Paratethys/Pannonian Lake) has shaped the evolution of A. torrentium. Our findings align with Klobučar et al. (2013) and Lovrenčić et al. (2020), suggesting that paleohydrogeographic fragmentation after the regional karstification favored the allopatric divergence long-lasting isolation of the NCD lineages. However, records of LD haplotypes at Fruška Gora indicate a wider distribution of at least some NCD lineages than previously thought. The Fruška Gora population is over 250 km of air distance from the nearest known LD populations in Lika (Croatia) and western Bosnia and Herzegovina. Similarly, the BAN also shows a disjunct distribution with the core of the phylogroup restricted to Banija in central Croatia and an additional population in Dalmatia in southern Croatia separated by about 200 km of air distance (Klobučar et al., 2013). The low genetic distances between the haplotypes of Fruška Gora and its neighbors suggest that the eastward dispersal of LD occurred much later compared to the southward dispersal of BAN. However, due to underrepresentation of samples from Bosnia and Herzegovina in phylogeographic studies, precise conclusions about the distribution of LD, BAN, and other NCD lineages cannot be drawn.

Extended sampling in the central and southern Dinarides could also clarify the phylogeography and evolutionary history of the CSE lineage. Indeed, its basal haplogroup, the CSE-DI, has so far only been detected in three locations, two in eastern Bosnia and Herzegovina (Sutjeska and a stream in the uppermost part of the Bosna River drainage; Klobučar et al., 2013) and one in western Serbia, where we detected this haplogroup in a tributary of the Lim River. The Lim originates in Montenegro and flows through Serbia before joining the Drina below its confluence with the Sutjeska, where one of the two known Bosnian populations also lives. The occurrence of CSE-DI in two separate drainages, the Drina and the Bosna, indicates at least a moderate dispersal ability of this haplogroup and a wider distribution in the central and southern Dinarides. Further sampling might reveal if this haplogroup occurs in the middle Drina tributaries, where it could meet the SB lineage. Extended sampling could also shed light on the relationship between CSE-DI and other CSE haplogroups, such as the basally positioned COI haplotype 88 (Klobučar et al., 2013) in the uppermost part of the Vrbas drainage (Fig. 4, S3).

Serbia is home to two more CSE haplogroups: CSE-S1 in two right tributaries of the Danube and the CSE-DA in a tributary of the Timok. The first haplogroup has so far only been reported from a few watercourses of the Sava drainage along the Croatian-Slovenian border, while the second haplogroup is the most widespread haplogroup in the European context and occurs in the left tributaries of the Danube in Romania. Interestingly, the uppermost part of the Timok drainage also harbors the SB-ES haplogroup, found in southern and eastern Serbia’s Južna Morava and the Struma drainages.

According to our analysis, the most widespread and diverse SB haplogroup was SB-S. Its thirteen COI and six 16S haplotypes occur only in Serbia, where it is mainly distributed in the Zapadna Morava drainage. In addition to the Zapadna Morava, we also found this haplogroup in a tributary of the Rača, which flows into the middle Drina, and in the Crna reka stream in the upper part of the Kolubara drainage, both locations in western Serbia. The Crna reka stream is a unique location as the SB-S and the SB-K meet here in syntopy. This is also the only location where we have detected the SB-K haplogroup.

The presence of a single haplogroup in the Zapadna and Velika Morava drainages indicates a more rigid spatial distribution compared to the southern drainages. Although much smaller than the Zapadna and Velika Morava drainages combined, both the Vardar and Struma drainages harbor at least two haplogroups (SB-V1 and SB-V2; SB-ST1 and SB-ST2; Lovrenčić et al., 2020). This could also be explained by the hydrogeological history of this area, which was much more complex in the south.

Conclusion

The present work offers a new perspective on the phylogeography, phylogeny, and molecular dating of the stone crayfish. It shows that Serbia, a country that played a key role in the pre-glacial southward dispersal, harbors many genetically distinct populations and is home to three distinct evolutionary lineages and several haplogroups. Moreover, we propose a revised phylogeny of the species, according to which the Apuseni lineage from Romania is not nested within the Northern-central Dinarides lineages but arose after the most basal split within A. torrentium. Consequently, there was a strong overlap between two molecular calibration approaches—a fossil-derived calibration of the outgroups to Austropotamobius and the geological calibration that attributes the origin of APU to the Middle Miocene rather than the Mio-Pliocene. The extended sampling and the revised time-calibrated phylogeny suggest that the split between A. pallipes and A. torrentium can be traced back to the late Oligocene, while the dispersal of A. torrentium from the north-central Dinarides to Serbia occurred in the late Miocene and is likely related to the complex and protracted process of disintegration of the Neogene freshwater lakes in southeastern Europe.