Introduction

Salmonidae contains many of the most economically and ecologically important temperate freshwater fishes, which, due to their common tetraploid ancestry (Allendorf and Thorgaard 1984), attract considerable attention in evolutionary research. The subfamily Thymallinae (graylings), considered a distinct family by Skurikhina et al. (1985) and Osinov and Lebedev (2000), is less well known, although the recognized species diversity has increased considerably over the past 15 years (Antonov 2004; Knizhin et al. 2006a, 2007; Knizhin and Weiss 2009). Additionally, increased conservation concerns over European grayling Thymallus thymallus (Dawnay et al. 2011; Weiss et al. 2013; Mueller et al. 2018), as well as its recent whole-genome sequencing (Varadharajan et al. 2018; Sävilammi et al. 2019), have opened the doors to more focused evolutionary and conservation management research (Papakostas et al. 2014; Mäkinen et al. 2016; Huml et al. 2018). However, a pre-requisite to detailed evolutionary research and conservation is a clear understanding of species diversity, systematic relationships and distributions. The last comprehensive phylogenetic analysis of Thymallus (Froufe et al. 2005) became obsolete due to the description of a number of new species (e.g. T. ligericus) (Table 1). A recent annotated checklist of grayling species diversity noted that several species are of questionable status, or require additional research (Dyldin et al. 2017).

Table 1 Overview of Thymallus species following the latest review of the genus by Dyldin et al. (2017), including synonyms listed in Fricke et al. (2020) (see also Dyldin et al. 2017 and references therein), type localities and location of type specimens (where known). Drainage systems of type localities are given in parentheses

The majority of grayling species occur in Asia, occupying some of the world’s largest river drainages. Siberia’s complex paleo-hydrology has shaped the diversity and distribution of grayling throughout Eurasia (Koskinen et al. 2002; Froufe et al. 2003a, 2005). This is illustrated by the phylogeographic patterns in the Altai-Sayan mountain region (Knizhin and Weiss 2009; Weiss et al. 2020a) or the three sympatric species with allopatric origins found in the Bureya River, Amur River drainage (Weiss et al. 2020b). High phylogeographic complexity is also found in Europe within the Danube River drainage as well as the larger Ponto-Caspian basin (Weiss et al. 2002, 2013; Gum et al. 2009; Marić et al. 2012, 2014).

Regional phylogeographic studies have further contributed to understanding within-genus diversity. For instance, Knizhin et al. (2008) demonstrated that different grayling phenotypes in the endorheic basin of Western Mongolia can be phylogenetically assigned to T. brevirostris. Thymallus brevicephalus found in the upper Irtysh River drainage in Kazakhstan is the sister taxon to T. brevirostris (Weiss et al. 2020a). While such studies combined single-gene mtDNA analysis with population genetic and morphological analysis, they may lack phylogenetic resolution due to the limited length of the mtDNA sequence analysed.

Population genetic investigations of grayling across their range support that most, if not all species thus far described are either allopatric or when found in sympatry show little to no introgression (Koskinen et al. 2002; Froufe et al. 2003b; Weiss et al. 2006, 2007, 2020b). Thus, it is likely that mtDNA-based phylogenetic analyses of grayling reflect evolutionary relationships of the entire genome. Mitochondrial genomes (henceforth ‘mitogenomes’) present a significant improvement in resolution over analyses based on one or a few mtDNA genes (Miya and Nishida 2015; He et al. 2018; Boo and Hughey 2019). Low coverage genome-wide short-read sequencing can be used to recover mtDNA even from relatively degraded samples, due to the high mtDNA copy number per cell (Miller et al. 2012; Liedigk et al. 2015). Despite such advantages, to date, only a single application of mitogenomes to grayling phylogeny was carried out (Ma et al. 2016), whereby only eight mitogenomes were sequenced across seven species.

DNA-based species delineation methods have become standard tools for delineating molecular operational taxonomic units (MOTUs) (e.g. Lopes-Lima et al. 2019) or uncovering potentially cryptic species (e.g. Pan et al. 2019) in many organisms, including freshwater fishes (Patil et al. 2018; Berbel-Filho et al. 2018; Corral-Lou et al. 2019). The results of such methods are dependent on the choice of samples and loci (Ritchie et al. 2016; Sukumaran and Knowles 2017) and must be carefully interpreted. However, for a widespread genus like Thymallus, such analyses should allow a standardized evaluation of historically described taxa, which have received limited attention from modern molecular analyses.

In this context, a phylogenetic analysis of the genus Thymallus was carried out, using newly obtained mitogenomes via low-coverage whole-genome sequencing, together with already available mitogenome sequences. Using comprehensive taxon sampling and literature review, we aimed to: (a) produce an updated robust phylogeny of graylings across their global range; (b) evaluate current systematics using a set of species delineation methods; (c) provide an overview of the distribution and conservation status of each species; (d) reconstruct ancestral biogeographic patterns and (e) provide an up-to-date taxonomic overview. More broadly, we aim to provide a baseline for future research on evolutionary patterns, taxonomic diversity, conservation and biogeography of grayling throughout the world.

Methods

DNA extraction and sequencing

Whole genomic DNA of 47 grayling representing species across their global range was extracted from fin-clips stored in 96% ethanol using a high-salt protocol (Sambrook et al. 1989) (Table S1, Electronic Supplementary Material). Sample choice was influenced by the availability of published mitogenomes, aiming for a data set covering the global diversity of the genus. Included are most species listed in Dyldin et al. (2017) and the newly described T. ligericus (Persat et al. 2019) (Table 1). For each sample, Illumina paired-end libraries (TrueSeq DNA PCR-Free kit) were prepared and ~ 2 Gb of raw paired-end 150 bp (PE150) reads were sequenced on an Illumina Platform at Macrogen (Seoul, South Korea) and Novogene (UK, Europe).

Mitogenome assembly, annotation and characterization

Whole mitogenome assemblies were performed using a seed and an extend algorithm in NOVOplasty v2.7.2 (Dierckxsens et al. 2016), and the baiting and iterative approach in MITObim v1.9.1 (Hahn et al. 2013) following the authors’ guidelines and using default parameters. Additionally, six mitogenomes of T. thymallus and eight from T. ligericus were produced with the long-range PCR protocol described in Denys et al. (2020). The 61 new mitogenomes were annotated with MitoAnnotator (Iwasaki et al. 2013) via the web-interface (http://mitofish.aori.u-tokyo.ac.jp/, last accessed 28.06.2019). Annotations were converted to standard feature table format using a custom script (https://github.com/chrishah/MitoFish2tbl) and deposited to GenBank (MT062993-MT063053) and validated by comparison to published data (Table S1).

Phylogenetic analyses

The sequenced mitogenomes were aligned in MAFFT (Katoh and Standley 2013) together with mitogenomes retrieved from GenBank for 31 Thymallus spp. and three outgroup taxa (Table S1). For subsequent analyses, the mtDNA control region (CR) was removed due to several repeat motifs. The best partition scheme for each gene or group of genes was estimated using PartitionFinder2 v2.1.1 (Lanfear et al. 2012). Maximum likelihood analyses (ML) were performed with the program RAxML-HPC2 Workflow v8.2.10 (Stamatakis 2016), using the general time reversible substitution model, with a gamma parameter and proportion of invariable sites (GRT + I + G) and 1000 bootstrap replicates. Bayesian phylogenetic inference (BI) was carried out in MrBayes v3.2.6 (Ronquist and Huelsenbeck 2003), using the best substitution models previously obtained in PartitionFinder2. Two independent runs were carried out (107 generations, one tree sampled every 100 generations), with default chains. PartitionFinder2, MrBayes and RAxML were run in CIPRES Science Gateway (Miller et al. 2010). Tracer v1.7.1 (Rambaut et al. 2018) was used to assess the effective sample size values (ESS) and determine the appropriate burn-in.

Species delineation methods

Two distance-based and one tree-based method were applied to determine molecular operational taxonomic units (MOTUs): the cluster tool implemented in BOLD (Ratnasingham and Hebert 2007), the Automatic Barcode Gap Discovery (ABGD) (Puillandre et al. 2012) using both an initial ABGD(i) and a recursive ABGD(r) partition, and an improved multi-rate version (mPTP; Kapli et al. 2017) of the Poisson Tree Processes (Zhang et al. 2013). The BOLD method accepts COI sequences only, but ABGD and mPTP were applied to the mitogenome dataset. For BOLD, COI sequences were retrieved from aligned mitogenomes and uploaded on the BOLD platform. ABGD was run online (https://bioinfo.mnhn.fr/abi/public/abgd/abgdweb.html; last accessed 20.02.2020), on the same dataset used for the phylogeny. The relative gap width was set to 1.0 and uncorrected p distances were used as a substitution model; remaining parameters were set as default. Finally, the mPTP method was applied to the Bayesian tree (without outgroups) using an online platform (https://mptp.h-its.org/#/tree; last accessed 23.01.2020). Samples were also assigned to species based on a combination of phenotypic identification, type locality, a-priori range knowledge and GenBank annotations. Uncorrected p distances were calculated among species or MOTUs for each delineation method (putative species, BOLD, ABGD (i), ABGD(r), mPTP using MEGA v10.0.5 (Kumar et al. 2018)) and frequency histograms of each were made using SPSS v26. For these calculations, four taxa listed in Table 1 were excluded due to the lack of evidence supporting their biological meaningfulness and/or our inability to assign a sample to them due to ambiguity concerning their type localities. Three of these taxa (T. mertensii, T. pallasii, T. signifer) are herein grouped within T. arcticus sensu lato (s.l.) and all are discussed in the results and discussion as well as Appendix 1 (Electronic Supplementary Materials). MOTU delineation was used as one line of potential evidence (sensu de Queiroz 2007) to support the existence of species. MOTU support by multiple algorithms together with population genetic and phylogeographic data, phenotypic distinction and ecological niche occupation were used to infer a taxon’s biological recognition. Lack of MOTU support was not necessarily used to argue against species recognition if multiple lines of other biological evidence supporting a species status exist.

Time-calibrated phylogeny

A time-calibrated phylogeny was produced using BEAST v1.10.4 (Drummond and Rambaut 2007; Suchard and Rambaut 2009) in CIPRES science gateway (Miller et al. 2010). Mitogenomes of eight individuals from Salmoninae and Coregoninae were added to the dataset as outgroups needed for the calibration scheme. The best partition scheme was selected as mentioned above. BEAST analysis was done with unlinked substitution and clock models, and a linked tree model, using uncorrelated relaxed molecular clock priors with log-normal distributions (Drummond et al. 2006). The birth-death speciation process (Gernhard 2008) was chosen as a tree prior, given the dataset comprised intra- and inter-specific relationships. One calibration point was set at 50 MY for the most recent common ancestor (MRCA) of Salmonidae, using the fossil †Eosalmo driftwoodensis Wilson, 1977 as a minimum time constraint for the family (Crête-Lafrenière et al. 2012; Lecaudey et al. 2018). Prior parameters were implemented following Lecaudey et al. (2018) (lognormal distribution, offset: 50, mean: 10, SD: 1). Another calibration point was set using the expansion of T. baicalensis into Lake Baikal (Koskinen et al. 2002), adjusted to 0.13 MY (Normal distribution, mean: 0.12, SD: 0.1), following the most recent dating of the ‘Angara breakout’ (Arzhannikov et al. 2018).

A third calibration point was set at 7.6 MY (stem dating; lognormal distribution; mean:7.6; SD:1.5) based on fossil remains of †Thymallus latisulcatus Rückert-Ülkümen & Kaya, 1993 found in Yalova (otolith used for species description) and Yalakdere, Turkey (Rückert-Ülkümen and Kaya 1993). The fossils were located in the strata corresponding to the Khersonian-Maeotian transition, whose date was recently revised to 7.6 MY (Lazarev et al. 2020; Palcu et al. 2019). Since this fossil may or may not correspond to a direct ancestor of the extant European lineages, the calibration was placed on the stem. The substitution rate prior for each partition was set with a normal distribution and 1%/MY (average 0.01 and 0.004 SD) following the mtDNA molecule calibration for salmonids (Smith 1992) and the reported lower bound of the T. baicalensis expansion (Koskinen et al. 2002). Additional Quaternary fossils of T. arcticus have been reported from North America (Cumbaa et al. 1981; McAllister and Harington 1969), but these were not included, since it was not possible to determine with certainty where these fossils would fall in the phylogeny. They could represent an ancestor of all North America populations, or only one of the lineages, or be placed within one of the lineages. Since the North American lineages are not monophyletic, using these fossils to delimit a minimum or maximum age for the split of both NA lineages would also require assuming where the split took place, and such an assumption might be less than informative. Analysis was performed with three 30 million MCMC iteration runs, sampling every 3000 runs. Burn-in and run convergence (ESS > 200) were determined using Tracer v1.7.1 (Rambaut et al. 2018). Independent runs were joined in LogCombiner v1.10.4 and the final tree produced in TreeAnnotator v1.10.4 (both in the BEAST package) and formatted in FigTree v1.4.4 (Rambaut 2012).

Species distributions and taxonomic review

Approximate species distributions were estimated based on both published and unpublished information (see Fig. 1). The area of occupancy (AOO) was calculated using ArcMap v10.7 (ESRI 2019) and hydrological network data from HydroAtlas v1.0 (Linke et al. 2019), as the total hydrological network area of lakes plus rivers larger than Strahler level three, a more realistic measure than a 2 × 2 km grid overlay (see Gomes-dos-Santos et al. 2019). Based on literature review, and present results, we comment on the biological status of each taxon, and review their taxonomic and conservation status based on IUCN criteria.

Fig. 1
figure 1

Map showing the global distribution range of Thymallus species. Information on sampling sites and species is given in Table S1. Numbers in the map refer to known contact zones of the following species: 1 = T. arcticus s.l. and T. baicalensis in the lower Enisei River; 2 = T. arcticus s.l. and T. baicalolenensis in the lower Lena River; 3 = T. nikolskyi and T. baicalensis in tributaries of the upper Ob River; 4 = T. baicalolenensis and T. baicalensis in tributaries of Lake Baikal; 5 = T. grubii, T. tugarinae and T. baicalolenensis (not all found in contact throughout this zone) in the upper and middle Amur drainage (including its tributaries such as the Ingoda, Argun and Zeya rivers); 6 = T. flavomaculatus, T. tugarinae and T. baicalolenensis in the Uda River drainage, 7 = T. burejensis, T. grubii and T. baicalolenensis in the upper Bureya River; 8 = T. tugarinae and T. flavomaculatus in tributaries of the lower Amur River

Historical biogeography and ancestral area reconstruction

To reconstruct the ancestral ranges of extant Thymallus diversity, we performed a Bayesian phylogeographic inference in discrete space using the Bayesian Stochastic Search Variable Selection (BSSVS; Lemey et al. 2009) implemented in BEAST. Considering both the current and paleo-hydrological drainage patterns, we considered seven biogeographic regions: Europe (west of Urals), Western Siberia (Kara Sea basin); East Siberia (north-flowing rivers, and Bering Sea); East Asia excluding the Amur (Okhotsk, Japan, Yellow and East China sea drainages); Amur River drainage; North America and the endorheic drainages of Western Mongolia. We used the same dataset, models and prior settings as in the dating analysis. BSSVS parameters were as follows: symmetric substitution model, strict location-trait clock and an exponential distribution for the location rates prior (mean = 1, offset = 0).

Results

Mitogenome analyses

Low-coverage genome sequencing recovered mitogenomes for all 47 samples analysed. With on average 18 million genomic reads per sample, the number of mitochondrial reads ranged from almost 3 to more than 50 thousand per sample for an organelle coverage ranging from 25x to 390x (Table S2). Reads and circularized molecules were deposited on GenBank (MT062993-MT063053, BioProject: PRJNA604892) (Table S2). The mitogenomes included 13 protein-coding genes, 22 tRNAs, 2 rRNAs and the CR (see Fig. S1, Electronic Supplementary Material). Average nucleotide composition and gene order were consistent across all individuals (Fig. S1).

Phylogenetic analyses and species delineation

ML and BI analyses showed the same topology; thus, only the Bayesian phylogenetic tree is represented (Fig. 2). The tree depicts three main groups of species, with the first split separating East Asian species (excluding T. burejensis) from all others, and the next split separates the three accepted European species from those in Siberia and the Altai, including T. burejensis. Pairwise divergence among a priori defined species ranged from 0.3% for T. nigrescens and T. baicalensis, to 5.5% between T. tugarinae and T. aeliani (Table S3). Net uncorrected p distances among all MOTUs revealed a bi-modal distribution, with a minor mode at 0.1–0.2% and a major mode centred on approximately 5% sequence divergence (Fig. S2). Congruence between well-supported clades, all delineation algorithms and a priori species assignment occurred for T. baicalolenensis, T. burejensis, T. aeliani and T. ligericus. Thymallus baicalensis was not monophyletic due to the placement of T. nigrescens within T. baicalensis. Thymallus tugarinae was also not completely congruent due to a divergent haplotype from the Kievka River, which drains to the Sea of Japan (KY078218).

Fig. 2
figure 2

Bayesian phylogenetic reconstruction based on whole mitogenome dataset (excluding the CR). Branch support values represent Bayesian posterior probabilities (above nodes) and maximum likelihood values (below nodes). An asterisk symbolizes that both values were ≥ 90%. To the right of the tree shown in grey bars are the results of three MOTU delineation methods. Each bar represents one MOTU: 12 for BOLD, 16 for mPTP and 21 for ABGDa (indicates both initial and recursive partition). Taxa with a smaller font positioned right of the MOTU bars represent names in use for which there is no biological support and thus may represent junior synonyms. Species are coloured according to Fig. 1

The most conservative delineation algorithm was BOLD, recognizing 12 MOTUs. The two ABGD schemes recognized 21 MOTUs, and the mPTP recognized 16 MOTUs. Paraphyletic relationships were inferred for T. thymallus and T. flavomaculatus. The former paraphyly results from a well-supported clade representing Alpine haplotypes from the upper Danube (Weiss et al. 2002, 2013) resolved as the sister clade (1.5% divergent) of T. ligericus and delineated as one (BOLD) or two (mPTP and ABGD) MOTUs. The node containing T. grubii, T. flavomaculatus and T. yaluensis represents the common ancestor of T. flavomaculatus. The earliest split and most divergent clade within T. arcticus s.l., delineated as a MOTU by mPTP and ABGD, was represented by haplotypes (MT063015-017) from north-eastern British Columbia in Canada (Table S1). No additional Arctic taxa (e.g. T. pallasii, T. mertensii or T. signifer) (see Appendix 1) could be recognized by MOTU delineation, though, as noted above, there is ambiguity concerning the assignment of these taxa to a location or sample in our data set. Phylogenetic analysis resolved the sister species T. brevirostris and T. brevicephalus (0.4% divergent), but not MOTU delineation. Similarly, phylogenetic analysis resolved the sister species T. nikolskyi and T. svetovidovi (0.6% divergent), but only the mPTP algorithm delineated them as distinct MOTUs.

Time calibration and ancestral range reconstruction

The MRCA of Salmonidae was dated at 59.3 MY, at the end of Palaeocene (Fig. 3) while that for Thymallus was 9.11 MY, at the end of Miocene. The MRCA of three East Asian species, T. tugarinae, T. flavomaculatus and T. grubii, was estimated at 8.42 MY also in the late Miocene, similar to the split between European species and all others (7.54 MY), whereas the MRCA of all European species was estimated at 3.86 MY corresponding to the mid-Pliocene. The next oldest node (6.35 MY) represented the split between T. burejensis and remaining species. The estimated MRCA of T. arcticus s.l. and T. baicalolenensis 5.01 MY and that of all T. arcticus was 2.39 MY, in the Pliocene and early Pleistocene, respectively. The splits between two pairs of allopatric species in the Altai-Sayan mountain region [T. brevirostris and T. brevicephalus (0.85 MY), and T. svetovidovi and T. nikolskyi (0.77 MY)] were both dated to the mid-Pleistocene. The Amur River drainage was the most likely ancestral location of origin of the extant grayling diversity, with much higher support than every other option (Fig. 3). A single colonization of Europe from Western Siberia was predicted, but multiple exchanges may have occurred among the other considered regions, exemplified by the three-basin distribution of the young T. baicalolenensis, or two predicted ancestors of North American lineages over 1 MY apart.

Fig. 3
figure 3

Time-calibrated phylogeny and biogeographic analysis of Thymallus spp. inferred from the whole mitogenome (excluding the CR). Three fossil calibration points (panel a) and a biogeography-based substitution rate were used (details in text). Ancestral ranges were reconstructed as a discrete trait following the phylogeographic diffusion BSSVS method; nodes and branches are colour-coded according to region. Hypothesised dispersal events are shown in panel b. Potentially relevant paleo-climatic events such as the Maximum Ponto-Caspian extent (MPC), the following Fluvial Plains Period and the Eurasian Glacial Cycles are represented at the bottom. MSC, Messinian salinity crisis; KMT, Khersonian-Meotian transition

Discussion

Species delineation

Our results strongly support the biological validity of at least 13 grayling species globally, plus two additional species tentatively all but one supported by at least one MOTU delineation approach, the exception being T. nigrescens. Microsatellites and single mtDNA sequences allow no distinction between T. nigrescens and T. baicalensis (Koskinen et al. 2002; Kaus et al. 2019). Next-generation sequencing RAD data supported a 15–30 thousand year divergence between the two taxa (Roman et al. 2018). The taxon is morphologically distinct from T. baicalensis based on a small dorsal fin and a higher number of gill rakers (Knizhin and Weiss 2009; Olson et al. 2019), and thus, its occupation of a distinct ecological niche supports its status as a distinct species despite relatively low genetic divergence.

Mean pairwise distances among species (3.7%) conformed to minimum thresholds applied in other studies. The majority (> 90%) of vertebrate species differ from their nearest neighbour by > 2% (Avise et al. 1999) or > 3% (Hebert et al. 2004) using standard mtDNA gene markers. Such thresholds have been noted in many data sets involving fishes (e.g. Ward 2009; April et al. 2013). Our divergence values may be comparatively low as we use uncorrected p distances instead of model-corrected distances (see Srivathsan and Meier 2012), salmonid mtDNA exhibits a relatively slow substitution rate (ca. 1%/MY) (Smith 1992; Froufe et al. 2005) and our calculations include non-protein regions of the mitogenome (excl. the CR). Higher values are obtained when using protein-coding genes alone (Fig. S3). Besides the baicalensis/nigrescens species pair, two other pairwise distances in our data under these thresholds are between sister species pairs in the Altai-Sayan mountain region (T. brevirostris and T. brevicephalus; and T. nikolskyi and T. svetovidovi). These species pairs span allopatric drainages subject to catastrophic paleo-hydrological flooding (Weiss et al. 2020a), which may have played a role in their cross-basin colonization history.

Five additional valid Thymallus names (T. yaluensis, T. brevipinnis, T. signifer, T. pallasii and T. mertensii) are supported neither by MOTU delineation nor with existing morphological or genetic data (Fig. 2) (see Appendix I). Two phylogenetically well-supported clades, also supported by one or more delineation approaches, have no species assignment. The first of these makes T. thymallus paraphyletic and consists of upper Danubian haplotypes, forming a sister clade to T. ligericus. This relationship was not recovered in previous phylogenetic analyses based on mitochondrial CR (Weiss et al. 2002, 2013; Marić et al. 2014), underscoring the higher resolution of the whole mitogenome and/or the importance of balanced taxon sampling, but also the need for additional investigation (see Appendix I for additional discussion). The second of these unassigned clades involves haplotypes (MT063015–017) of T. arcticus s.l., representing a lineage that is believed to derive from the Nahanni glacial refuge (Stamford and Taylor 2004). This mtDNA lineage was among three lineages described in North America (Stamford and Taylor 2004), whereby two colonization events into North America were proposed. The possibility that these haplotypes could represent a distinct taxon must be better explored with population genetic analysis.

The paraphyly of the T. flavomaculatus clade in our tree may be based on the complex phylogeographic structure of both T. flavomaculatus and T. grubii combined with our limited sampling. Both T. flavomaculatus and T. tugarinae are found in tributaries of the Amur River drainage, but also coastal drainages flowing directly into the seas of Japan and Okhotsk (Antonov and Knizhin 2014) promoting fragmentation and divergence (Appendix I).

Biogeography

Species diversity and range size in Thymallus spp. reveal a latitudinal gradient with few species above the Arctic Circle and increasing diversity (Spearman’s rank − 0.819, P < 0.05) and decreasing range size (Spearman’s rank 0.785, P < 0.001) with decreasing latitude (Fig. 1). Ancestral trait reconstruction supports the present Amur River drainage as the place of origin for extant diversity (Fig. 3), with lineage sharing among coastal draining rivers reflecting drainage re-arrangements and dynamics of the Pleistocene (Grosswald 2009; Antonov 2012; Antonov and Mikheev 2016). Europe’s colonization by Thymallus took place somewhere in the Pliocene, between the split from extant Asian lineages in the late Miocene, and the European lineage crown age in the mid-Pliocene. We note that the rough estimates of the timing of these events are affected by the fossil calibration point placed on the stem leading to the European radiation of the genus, derived from the 7.6 MY-old fossil remains of †T. latisulcatus. A previous analysis lacking this calibration point resulted in ages about 20% younger, but still falling in the same epochs described here, and provided even higher support for an Amur ancestral range (all nodes > 80%). Higher connectivity among Arctic Ocean drainages is illustrated by at least two colonisations of North America from Eastern Siberia and is analogous to Holarctic or near Holarctic distributions in Pungitius (Guo et al. 2019), Esox (Skog et al. 2014), and Lota (Van Houdt et al. 2005). Dispersal into Western Siberian rivers seems to have taken place multiple times, most recently from Eastern Siberia and the Amur, after a much more ancient event. The suggested colonization of the endorheic basin in Mongolia from East Siberian basins evokes the still unconfirmed proposals of a Gobi-Amuran proto-glacial drainage system (Grosswald 1998). Coastal rivers of East Asia retained connectivity with the Amur during the Pleistocene, reflected by the absence of a geographically coherent monophyletic group. However, a Miocene-Pleistocene split and posterior isolation between the extant diversity in both basins cannot be excluded, due to the long un-split branches from Miocene to Pleistocene (Fig. 3). Overall, the biogeographic patterns of Thymallus are concordant with scenarios proposed for several taxa. Cobitid fishes reveal an early radiation in eastern Asia, and colonization of Europe from East Asia in the late Miocene (Šlechtová et al. 2008; Perdices et al. 2016). A colonization from northeast Asia to Europe is proposed for Carassius (Rylková et al. 2013), and similar colonization of Rhodeus (via vicariance) is inferred for the Pliocene (Bohlen et al. 2006), supported by the disjoint distribution and hypothesised vicariance of Margaritifera (Bolotov et al. 2016) a parasitic bivalve of Rhodeus. These studies, however, do not provide explicit paleo-hydrological pathways or events that may have facilitated these patterns. While some of these hypothesised colonization events are pre-Pleistocene, our knowledge at least of Pleistocene paleo-dynamics across Russia reveals numerous events of paleo-hydrological connectivity (via large paleo-lakes and mega floods) across Siberia and through the Ponto-Caspian region into Europe (Komatsu et al. 2016). These events have already served hypotheses concerning the phylogeographic structure of Thymallus, Hucho and Brachymystax across their Siberian range (Koskinen et al. 2002; Froufe et al. 2003a; Weiss et al. 2020a). Nonetheless, much work remains in matching the timing of specific events of dispersal and vicariance with specific paleo-events promoting cross-drainage connectivity and isolation.

The following section provides a very brief overview of the 15 focal taxa of our phylogenetic and MOTU analyses. Included is a general description of their geographic range and phylogenetic position as well as our opinion on each taxon’s validity as a species (i.e. regarding its biology) and its suggested conservation status based on IUCN criteria. For more detailed comments on these taxa and those for which we do not currently find biological support, or cannot assign to a sample, see Appendix I.

Amur and east Asian taxa

Thymallus tugarinae—  Lower Amur grayling

Occurs in the middle to lower Amur River drainage and several coastal draining rivers including northwestern Sakhalin Island (Fig. 1). Highly divergent from its sister clade containing T. grubii (4.1%) and T. flavomaculatus (4.2%). Easily distinguished from sympatric T. flavomaculatus using meristic or morphological characters (Knizhin et al. 2007); reciprocally monophyletic with other Amur graylings (Fig. 2) (Froufe et al. 2003b, 2005); reproductive isolation supported (Froufe et al. 2003b). Shows significant phylogeographic structure across its range (Froufe et al. 2003b).

Taxonomic validity

A species reproductively isolated and easily distinguished from the sympatric T. flavomaculatus (see T. flavomaculatus below).

Conservation remarks

Area of occupancy exceeds 2000 sq. km. Suggested Global Status: Least Concern.

Thymallus flavomaculatus— Yellow-spotted grayling

Occurs in some coastal rivers draining into the seas of Japan and Okhotsk, as well as some lower Amur River tributaries (Fig. 1); overlaps considerably with T. tugarinae (Froufe et al. 2003b; Antonov and Knizhin 2011). Easily distinguished from T. grubii and sympatric T. tugarinae based on a characteristic yellow-orange spot located in the posterior area of the dorsal fin (Knizhin et al. 2006a).

Taxonomic validity

The taxon was originally described as a subspecies of T. grubii, widely distributed in the Amur River drainage. Whether or not the taxon is treated as a species or a sub-species is beyond the scope of this manuscript. In our analysis, T. flavomaculatus is paraphyletic (see additional comments below for T. grubii).

Conservation remarks

Coastal populations are more threatened by anthropogenic changes and overfishing than interior populations. Some range fragmentation is present and the area of occupancy may be as little as 400 sq. km. Suggested Global Status: Near threatened.

Thymallus grubii— Amur grayling

A small-sized grayling, easily diagnosed based on body and dorsal-fin colouration; occurs throughout the middle to upper Amur River drainage (Fig. 1). Significant phylogeographic structure is reported (Froufe et al. 2003b; Knizhin et al. 2004; Weiss et al. 2020b). Occurs in sympatry with T. burejensis and T. baicalolenensis in the upper Bureya River, where reproductive isolation is strong but not complete (Weiss et al. 2020b), and in sympatry with T. tugarinae in the lower Zeya River and Ingoda River, and with T. baicalolenensis in the upper Zeya River and upper Ingoda River (Antonov and Mikheev 2016).

Taxonomic validity

A species diagnosable from all other grayling (Weiss et al. 2020b). Could be also treated as the nominal species of a three-taxon aggregate, consisting of T. (grubii) grubii, T. (grubii) flavomaculatus and T. (grubii) yaluensis (see Article 6.2 in the International Commission on Zoological Nomenclature 1999) (see also Appendix I).

Conservation remarks

Occupies a relatively large range and several relatively pristine river systems. Suggested Global Status: Least Concern.

Thymallus burejensis— Bureya grayling

A robust-bodied grayling, endemic to the middle and upper reaches of the Bureya River. Occurs in sympatry in the upper Bureya River with T. baicalolenensis and T. grubii, and shows relatively strong (albeit not complete) reproductive isolation (Weiss et al. 2020b). Mitogenomic distances range from 3.0 to over 4.9% between T. burejensis and all other congeners.

Taxonomic validity

A species displaying significant reproductive isolation with two other grayling taxa.

Conservation remarks

The Bureya River is over 700 km in length but the mid-to-lower reaches have been heavily impacted by hydropower, eliminating or fragmenting portions of the specie’s range. Currently, its area of occupation does not exceed 100 sq. km. Suggested Global Status: Endangered.

European taxa

Thymallus aeliani —Adriatic grayling

Occupies the middle to upper reaches of the Soca River in Slovenia and tributaries of the Po and Adige rivers in Italy (Fig. 1). Divergent from T. thymallus (2.7%) and T. ligericus (3.6%) within the clade of European grayling taxa (Fig. 2). Meraner et al. (2014) reported significant regional structure in the Adige River drainage.

Taxonomic validity

A species based on its deep divergence to all other grayling and allopatric distribution in Adriatic draining rivers.

Conservation remarks

River engineering measures, hydropower expansion and water pollution are among widespread threats that have reduced at least 50% of the species range; introgression with non-native lineages is a major threat to T. aeliani and there are few pure genetic populations left (Sušnik et al. 2004; Meraner et al. 2014). The area of aquatic occupancy may be as little as 100 sq. km. Suggested global status: Endangered.

Thymallus thymallusEuropean grayling

Widely distributed (Fig. 1); until recently included all European stocks of grayling. They, along with T. aeliani and T. ligericus, are the only Thymallus species with a sub-terminal mouth. Significant phylogeographic structure throughout Western Europe (Weiss et al. 2002; Gum et al. 2009), and from the western Balkans and Caspian Sea catchment (Marić et al. 2012, 2014). The taxon is paraphyletic due to the systematic relationship to both T. ligericus and its sister clade of upper Danubian haplotypes.

Taxonomic validity

A species distinguished from all Asian grayling by a subterminal mouth and strict long-time allopatry to T. ligericus and T. aeliani.

Conservation remarks

Currently listed as a species of Least concern. Locally, and especially in the southern portions of its range, population declines or extinctions are widespread (see Weiss et al. 2013), leading to several endangered assignments at national levels. Suggested Global Status: Least Concern.

Thymallus ligericus— Loire grayling

Recently described endemic of the upper Loire River drainage in France (Persat et al. 2019) (Fig. 1). Populations remain genetically pure despite 50 years of stocking with foreign strains (Persat et al. 2016), suggesting they either outcompete foreign lineages (i.e. T. thymallus) or display reproductive isolation. Morphologically distinguished from T. thymallus by a more pointed snout, more inferior mouth and profuse spotting (Persat et al. 2019). In our analysis, they appear as a shallow, monophyletic clade, 1.5% divergent from T. thymallus haplotypes from the upper Danube drainage and 2.2% divergent from all T. thymallus samples. The zoogeographic origins of this species in the Loire basin are unknown.

Taxonomic validity

A species based on its morphological and genetic distinction and long-term isolation (2 MY or more) from grayling of adjacent river drainages (Rhine and Rhône) (Persat et al. 2016).

Conservation remarks

Populations are in decline and there are concerns of decreasing water flows and rising water temperatures. Its area of occupancy may not exceed 35 sq. km, but at least six or more fragmented populations exist. Suggested Global Status: Vulnerable.

Altai, Siberian and North American taxa

Thymallus arcticus s.l.—Arctic grayling

Occurs from just east of the Urals in Russia to Hudson Bay, Canada; a disjunct population in the Big Hole and Red Rock river drainages in Montana, USA (Fig. 1). In our analysis, T. arcticus s.l. is 2.6% divergent from its sister taxon T. baicalolenensis. Phylogeographic structure across the Arctic is weak; for example, the haplotype MT063012 near the type locality (Sob River, Ob) groups closely with haplotypes from the Lena River drainage and the Okhotsk Sea catchment in far eastern Russia. Haplotype MT063010 from the presumed type locality of T. pallasii (see Dyldin et al. 2017) (Appendix I) in eastern Siberia is intermediate between most North American haplotypes and those from Kamchatka, which some authors assign to T. mertensii.

Taxonomic validity

A species based on its clear genetic divergence to other taxa, morphological distinctiveness especially in the dorsal-fin size and colouration (albeit with regional variation), and confirmed reproductive isolation to T. baicalensis (Weiss et al. 2007) and T. baicalolenensis (Weiss et al. 2006). See comments in Appendix I concerning potential recognition of additional taxa, herein treated as T. arcticus s.l.

Conservation remarks

The global population of T. arcticus s.l. is listed by the IUCN as a species of Least concern (LC). Numerous reports exist of population size declines for the species locally, both in North America and Russia. Suggested Global Status: Least Concern.

Thymallus baicalensis —Baikal black grayling

Occurs throughout the Enisei River drainage including Lake Baikal and its major tributary the Selenga River (Fig. 1); also in some right-hand tributaries of the Ob River drainage (Mrassu and Kabyrza rivers) represented by haplotypes MT063026 and MT063025 (Fig. 2). Displays a net divergence between 1.6 and 1.9% to four species comprising its sister clade (T. nikolskyi, T. svetovidovi, T. brevicephalus and T. brevirostris). Based on Koskinen et al. (2002), multiple samples throughout Lake Baikal were in Hardy-Weinberg Equilibrium reflecting a single taxon occupying Lake Baikal; this inference is also supported by morphological and genetic data from Knizhin et al. (2006b). All species delineation approaches allocated both T. baicalensis and T. nigrescens as a single MOTU.

Taxonomic validity

A species based on clear genetic divergence to other taxa, distinct dorsal-fin colouration and multiple contact zones with little to no gene flow with other species (T. arcticus, T. baicalolenensis) (see Knizhin et al. 2006b; Weiss et al. 2007). Thymallus brevipinnis is suggested to be a synonym of T. baicalensis (Appendix I).

Conservation remarks

Local population declines and extinctions reported; threats include hydropower development, overfishing and pollution. However, the species has a very large distribution range and occupies many habitats that are in pristine or near-pristine condition. Suggested Global Status: Least Concern.

Thymallus baicalolenensis —Baikal-Lena grayling

Small-bodied grayling occurring throughout the Lena River drainage (Fig. 1). Also occurs in Lake Baikal tributaries, most notably the Barguzin River drainage, the Tiya River and Yakchinskie Lakes of the upper Angara River (Knizhin et al. 2006c, 2008; Kirillov and Knizhin 2014). Also found in the Uda River drainage, Sea of Okhotsk (Antonov and Knizhin 2011) and upper Amur River drainage (Antonov and Knizhin 2011; Antonov and Mikheev 2016), including the upper Bureya River, together with T. burejensis and T. grubii, where it could be diagnosed with 100% accuracy based on morphological characters (Weiss et al. 2020b). Its morphological and genetic distinction from T. arcticus s.l. is shown in Weiss et al. (2006) (therein referred to as T. a. lenensis) and Koskinen et al. (2002) (therein referred to as T. arcticus, Lena basin). Both its body and dorsal-fin colourations are highly distinct (Knizhin et al. 2008; Knizhin and Weiss 2009) from T. arcticus s.l. as well as all other members of the genus (Dyldin et al. 2017). Thymallus baicalolenensis is reciprocally monophyletic to T. arcticus s.l. with a net divergence of 2.6% (Fig. 2).

Taxonomic validity

A species showing relatively strong reproductive isolation to four species to which it comes into contact; T. baicalensis, T. arcticus s.l., T. burejensis and T. grubii.

Conservation remarks

Very large distribution area, occupies both large rivers and headwaters including small lakes. Found in numerous relatively remote and/or pristine systems. Suggested Global Status: Least Concern.

Thymallus brevicephalus —Shorthead or Markakol grayling

Reported endemic to Lake Markakol (Dyldin et al. 2017); probably not limited to this lake. Haplotypes LC168675 and MT063035 (Fig. 2) stem from samples in the upper Irtysh River drainage; population genetic analysis shows close affinity with samples from the Kara-Kaba River (Weiss et al. 2020a) (Fig. 1). Both our mitogenome analysis and a population genetic analysis in Weiss et al. (2020a) show a very close (0.4%) sister clade relationship to T. brevirostris (Fig. 2). More data concerning morphology and ecology is needed.

Taxonomic validity

Although closely related genetically to T. brevirostris, viewed as species based on highly distinct morphology (short jaws, no dentation) and ecology (predominantly benthivore), as well as strict allopatric occurrence to T. brevirostris.

Conservation remarks

Populations within Lake Markakol are in serious decline due to overfishing (perhaps > 50% across recent decades, M. Baimukanov, pers. comm.); listed as endangered in Kazakhstan. Its strict area of aquatic occupancy (Markakol Lake) is < 700 sq. km, but its distribution is likely considerably larger. Suggested Global Status: Data deficient.

Thymallus brevirostris— Mongolian grayling

Distributed across the large endorheic basin of Western Mongolia, extending into Tuva Republic, Russia (Fig. 1). Considered a large-growing, piscivorous grayling, with pronounced dentation on both jaws and vomer (Knizhin et al. 2008). Displays considerable phenotypic variability concerning head and jaw size and dentation (Knizhin et al. 2008; Weiss et al. 2020a). Displays a close (0.4%) sister clade relationship to T. brevicephalus from Lake Markakol (Fig. 2).

Taxonomic validity

Although closely related to T. brevicephalus, viewed as a species based on highly distinct morphology (large jaws, significant dentation), ecology (predominantly piscivorous) and strict allopatric occurrence to T. brevicephalus (see also Appendix I).

Conservation remarks

Overfishing (primarily illegal), hydropower development and spawning ground deterioration are a major concern. Suggested Global Status: Vulnerable.

Thymallus nigrescens— Chovsgul grayling

Endemic to Lake Chovsgul in Mongolia. The single mitogenome (MT063028) in our analysis is 0.3% divergent from T. baicalensis (Fig. 2). Traditional genetic markers (microsatellites and single mtDNA sequences) allow no distinction between T. nigrescens and T. baicalensis (Koskinen et al. 2002; Kaus et al. 2019).

Taxonomic validity

Treated as a species despite its very close relationship to T. baicalensis. Morphologically distinct from T. baicalensis based on a small dorsal fin and a high number of gill rakers (Knizhin et al. 2008; Olson et al. 2019), occupies a distinct allopatric ecological niche.

Conservation remarks

Listed as endangered in the Mongolia Red List (Ocock et al. 2006). Its habitat comprises the 2760 sq. km Lake Chovsgul. Substantial illegal harvest via gillnetting in the littoral zone has led to dramatic declines in population sizes (Free et al. 2015). Suggested Global Status: Vulnerable.

Thymallus nikolskyi —Upper Ob grayling

Originally reported from the upper Ob River drainage; exact distribution is unclear. May occur together with T. baicalensis in the Mrassu and Kabyrza rivers of the Ob River drainage (Fig. 1). Our analysis reveals a close (0.6%) sister relationship to T. svetovidovi from the upper Enisei River drainage, and considerably more divergence from the two other taxa, T. arcticus s.l. and T. brevicephalus, in the Ob River drainage (3.0% and 1.7%, respectively). Population genetic analysis shows no gene flow between T. nikolskyi and T. brevicephalus (Weiss et al. 2020a).

Taxonomic validity

Treated tentatively as a species whereby the distributions and genetic relationships of all grayling in the Altai-Sayan mountain region require further investigation (Weiss et al. 2020a).

Conservation remarks

Insufficient data on its diagnosis and distribution. Suggested Global Status: Data deficient.

Thymallus svetovidovi— Upper Enisei grayling

Recently described from the Sharga Gol River in Mongolia (Knizhin and Weiss 2009); occurs in headwater reaches of the Enisei River in Mongolia and possibly Tuva Republic (Fig. 1). Bright yellow caudal peduncle and fin is highly characteristic; a close (0.6%) sister taxon to T. nikolskyi from the upper Ob River drainage.

Taxonomic validity

A species based on a unique phenotype and genetic divergence from T. baicalensis. See Appendix I for comments on potential synonymies.

Conservation remarks

Known distribution range is limited (but uncertain), populations are reportedly dense and the river systems where this taxon is found are pristine. Thus, there are currently no threats to this taxon. Suggested Global Status: Data deficient.