Molecular analysis reveals multiple native and alien Phoxinus species (Leusciscidae) in the Netherlands and Belgium

Fresh waters are among the most endangered ecosystems, one of the problems being the lack of data on biodiversity. In the center of the missing knowledge are cryptic species, two (or more) species classified as a single one due to their (seemingly) indistinguishable morphology. Lack of research and stabilizing selection are reflected in the cryptic diversity of the genus Phoxinus (Leusciscidae), the studies of which have intensified over the last two decades and reveal undetected taxonomic complexity. Moreover, some of the Phoxinus lineages act as invasive species, while others are endangered by their alien counterparts. Minnows have been intentionally (as food for predatory fish species) or unintentionally (with other fries) stocked causing hybridisation zones in Norway, Portugal, Spain, France, Italy, Germany and Austria. Given that genetic identity and lineage assignment of Phoxinus from Belgium and the Netherlands have not been researched, the goal of the study was to examine available samples from known localities in the area in order to infer- whether they are native or not. For this purpose, the barcoding region cytochrome oxidase I, another mitochondrial gene cytochrome b, a nuclear recombination activating gene 1 and a combination of these markers from a wider neighboring region were analyzed. The study found four different Phoxinus species/lineages occurring in Belgium and the Netherlands: P. phoxinus, P. csikii, P. septimaniae and genetic lineage 11 (possibly P.cf. morella). While the first seem to be native, the other three were probably introduced.


Introduction
Fresh waters are among the most endangered ecosystems, facing five major threats: overexploitation, pollution, flow modification including destruction of habitats, and invasive species, topped with the global scale environmental changes such as global warming, precipitation shifts and nitrogen accumulation (Dudgeon et al. 2006 and the references therein). The Abstract Fresh waters are among the most endangered ecosystems, one of the problems being the lack of data on biodiversity. In the center of the missing knowledge are cryptic species, two (or more) species classified as a single one due to their (seemingly) indistinguishable morphology. Lack of research and stabilizing selection are reflected in the cryptic diversity of the genus Phoxinus (Leusciscidae), the studies of which have intensified over the last two decades and reveal undetected taxonomic complexity. Moreover, some of the Phoxinus lineages act as invasive species, while others are endangered by their alien counterparts. Minnows have been intentionally (as food for predatory fish species) or unintentionally (with other fries) stocked causing hybridisation problem is aggravated by the lack of data as biodiversity assessment for many freshwater ecosystems is incomplete or absent. In the center of the missing knowledge are cryptic species, which are two (or more) species classified as a single one due to their (seemingly) indistinguishable morphology (Bickford et al. 2007). Cryptic species are either pseudo-cryptic-a consequence of insufficient in-depth taxonomic studies (e.g., Feulner et al. 2006) or true cryptic species in which the lack of phenotypic diversity originates from stabilizing selection (e.g., Martin and Bermingham 2000) A combination of both reasons, lack of research and stabilizing selection, is reflected in the cryptic diversity of the genus Phoxinus (Leusciscidae). The studies of the genus, which comprises ubiquitous small minnows inhabiting diverse waterbodies (Frost 1943), have intensified over the last two decades, and it has become clear that it harbours previously undetected taxonomic complexity (Kottelat 2007;Palandačić et al. 2017). Currently there are twenty-three recognized mitochondrial genetic lineages, with data for several major river drainages still missing (Palandačić et al. 2020;Denys et al. 2020). Thirteen of the Phoxinus lineages represent valid species, while three additional lineages and sub-lineages have available names but their species status is not yet confirmed (lineage 9d-P. cf. carpathicus, lineage 11-P. cf. morella, lineage 20-P. cf. chrysoprasius). It is also worth noting that most of the genetic analyses in the genus Phoxinus were based on the analysis of mitochondrial genes, while nuclear data supporting species delimitations are inconclusive or still missing (Palandačić et al. 2020).
Besides cryptic diversity and problems with species delimitation (Palandačić et al. 2017(Palandačić et al. , 2020, the conservation efforts of the genus Phoxinus are also compounded by some of the lineages acting as invasive species, while others are endangered by their non-native counterparts or due to one of the threats described above, e.g., global warming (Závorka et al. 2020). Minnows have been introduced to south-west Norway, the Spanish Pyrenees and Douro drainage in Portugal, probably by live-bait fishing (Museth et al. 2007;Corral-Lou et al. 2019;Garcia-Raventós et al. 2020). Besides, Phoxinus have also been intentionally (as food for predatory fish) or unintentionally (with other fries) stocked causing hybridisation zones in Spain, France, Italy, Germany and Austria (Corral-Lou et al. 2019;Denys et al. 2020;De Santis et al. 2021;Knebelsberger et al. 2015;Palandačić et al. 2020). However, in the Western Balkans the hybridisations seem to be natural, and occur as a consequence of historical and recent migration patterns of the minnows (Palandačić et al. 2020). The probability of a lineage being introduced vs natural migration was (descriptively) assessed using historical museum samples (Knebelsberger et al. 2015;Palandačić et al. 2020), but these are not always available. Another possible method is to trace similar haplotypes: if the minnows have migrated naturally, it is anticipated that they are most closely related to neighboring populations (Garcia-Raventós et al. 2020). However, for this purpose, a dense sampling and analysis of comparable molecular region (such as cytochrome oxidase I-the barcoding fragment-COI) is needed throughout Europe. In addition, fish introductions from the neighboring areas have also been known to take place (e.g., Snoj et al. 2021).
In the Netherlands, historical data from fish hatcheries show that fish introductions (e.g., salmon, trout, whitefish, shad, carp, pikeperch) have started in 1879 (Groot 1985) with stocks originating from Germany, Denmark, France, United States, Russia, Poland and Austria. There is no mentioning of Phoxinus minnows, but they are generally considered as a suitable food source in hatcheries for salmon in the Netherlands (Mulier 1900). In addition, minnows are bred as live baits and thus it is possible that some have been released into new habitats (Thaulow et al. 2014). In the surrounding rivers, in the Rhine and Meuse drainages on the territory of France and Germany, Phoxinus phoxinus is believed to be native, while several additional lineages/species detected in this area are suggested to be introduced (Knebelsberger et al. 2015;Denys et al. 2020), with Knebelsberger et al. (2015) mentioning that the native Phoxinus populations are possibly threatened by minnows from the Danube (P. csikii-mitochondrial genetic lineage 5b according to Palandačić et al. 2020). Besides P. csikii, P. septimaniae (lineage 12) from the Rhone drainage was detected (Knebelsberger et al. 2015). Eastwards, the distribution area of lineage 11 (as specified in Palandačić et al. 2020; not confirmed as a species yet but the name P. morella is available species name) begins and encompasses Wesser, Ems, Oder and Elba drainages (Knebelsberger et al. 2015;Rothe et al. 2019;Palandačić et al. 2020).
Given that genetic identity and lineage assignment of Phoxinus from Belgium and the Netherlands have not been researched, the goal of the study was to examine available samples from known localities in the area in order to infer whether they are native or not. For this purpose, the barcoding region COI, another mitochondrial gene cytochrome b (cytb), the nuclear recombination activating gene 1 (RAG1) and a combination of these markers from a wider neighboring region were analyzed.

Materials and methods
In this study, the coding follows mitochondrial genetic lineages as specified in Palandačić et al. (2020) and where available, species names are given in brackets. In addition, two new Phoxinus species (lineage 16 -P. fayollarum and 23 -P. dragarum) from Denys et al. (2020) are also included. Lineage 21 (sensu Palandačić et al. 2020) has no COI data available, as the whole analysis was performed based on cytb (Corral-Lou et al. 2019), thus it cannot be included in the phylogenetic analyses. However, Denys et al. (2020) speculates that lineages 21 and 23 belong to one and the same species, P. dragarum. The lineage 11 could correspond to the available name P. morella, which was suggested in Palandačić et al. 2017, however it was not confirmed as a species yet, thus it is denoted as P. cf. morella. Finally, the species name P. phoxinus is connected to the mitochondrial lineage 10 based on historical data for this area represented by Knebelsberger et al. (2015) and adopted by Palandačić et al. (2017). Nevertheless, the neotype designated by Kottelat (2007), which comes from an area with several hybridising lineages (5b, 10, 12), was never genetically analyzed. Thus, the connection to the name P. phoxinus with the lineage 10 remains uncertain.
The specimens were caught in the period between November 2017 and April 2018 with hand nets (under Animal Experimentation Law: WoD Licence No. TRC/VWA/2012/4275) and either a small piece of the fin or a skin swab were sampled. Altogether forty-nine new specimens from five sampling sites in Belgium, the Netherlands (Table 1, Fig. 1) and a reference sample from neighboring area in Germany were newly analyzed for this study.
Mitochondrial DNA DNA was extracted with QIAmp DNeasy Blood and Tissue Kit (QIAGEN, Germany) following manufacturer´s protocol. Amplification of COI and cytb was conducted as described in Palandačić et al. (2017). Succeeding polymerase chain reaction (PCR), sequencing was performed in both directions at Microsynth (Vienna, Austria). Raw sequences were visually inspected, if necessary reverse complemented, aligned with MEGA 6.0 (Tamura et al. 2011) and trimmed.
Due to several barcoding projects (e.g., Knebelsberger et al. 2015) COI sequences of Phoxinus across Europe became available for comparison, consequently other molecular projects have followed the trend (Palandačić et al. 2017(Palandačić et al. , 2020Denys et al. 2020). Thus, for phylogenetic tree reconstruction, only COI sequences, which were available for all but one lineage (lineage 21, Corral-Lou et al. 2019), were used.
The sequences were collapsed to unique haplotypes with FaBox v1.5 (Villesen 2007) and the most fitting evolutionary model was selected using ModelFinder (Kalyaanamoorthy et al. 2017) applying the BIC criterion. The tree was calculated with . First 25% of the trees was discarded as burn-in and the rest were used to build a 50% majority rule consensus tree. An unrooted minimum-spanning network was built using COI and the reference sequences from four genetic lineages, to which the specimens clustered according to the phylogenetic tree. The network was built using the median-joining algorithm implemented in Network 5.1 (www. fluxus-engin eering. com) with default settings. A minimumspanning network with COI-cytb concatenated sequences was also constructed, including the specimens, for which both COI and cytb sequences (Table S1) were available in the GenBank.
Genetic diversity indices (haplotype diversity (Hd), nucleotide diversity (π), number of polymorphic sites (S) and mean number of pairwise differences (k)) were calculated with DnaSP v6 (Rozas et al. 2017) for each genetic lineage (including the reference samples from the GenBank) using COI sequences.  (1), Germany (1) and the Netherlands (3), and the current distribution of Phoxinus species in the wider surrounding area. Mitochon-drial genetic lineages (according to Palandačić et al. 2020) are given in brackets. Major river drainages are shown Nuclear DNA From each population, RAG1 gene from a few randomly chosen individuals was amplified and sequenced. The sequences were inspected by eye and trimmed. The gametic phase of the heterozygous individuals was determined using Phase 2.1 (Stephens et al. 2001;Stephens and Scheet 2005), implemented in DnaSP v6 (Rozas et al. 2017). Finally, an unrooted minimum-spanning network was constructed with the median-joining algorithm implemented in Network 5.1 (www. fluxus-engin eering. com) with default settings.

Mitochondrial DNA
Sequencing resulted in 49 new cytb and COI sequences, which were 1091 and 651 base pair long, respectively.
For the phylogenetic analysis, 1044 additional COI sequences from the GenBank (Table S1) were used. The alignment was trimmed to 634 bp to match the length of the shorter GenBank sequences and collapsed to 247 unique haplotypes. The resulting phylogenetic tree (Fig. 2) exhibited good support for the genetic lineages, while the deeper nodes remained unsupported. The newly analyzed specimens from sampling sites DON, MEE, MOEL, VLO clustered mostly to genetic lineage 10 (P. phoxinus) and from sampling site EPE to lineage 11 (P. cf. morella). However, one specimen from MEE and one from MOEL clustered to 5b lineage (P. csikii) and one specimen from the sampling site MOEL clustered to genetic lineage 12 (P. septimaniae).
The COI haplotype network and the pie charts representing the number of analyzed specimens and their genetic lineages are represented in the Fig. 3. As previously shown in the phylogenetic tree, most of the specimens from DON, MEE, MOEL and VLO clustered to the lineage 10 (P. phoxinus; Fig. 3c), where they share the most common haplotype with the specimens collected from rivers Agger, Sieg, Ruhr and Rhine in Germany as well as with the specimens collected in the tributaries to Meuse in France. However, in all sampling sites (DON, MEE, MOEL, VLO) also unique haplotypes were detected, which were previously not known for the lineage 10.
The MOEL specimen, which clustered to the lineage 5b, exhibited the most common haplotype, detected also in rivers Sieg and Main (Rhine) in Germany, as well as in L´Arve and Boiron River (Rhone) in France. The same haplotype was present in Italian Alpine lakes (Po River watershed). The 5b haplotype detected in the MEE sample was unique and one mutational step distant form the most common 5b haplotype (Fig. 3c).
The specimens from EPE clustered to the lineage 11 (P. cf. morella). Three of the specimens formed their own haplogroup, while one specimen represented a unique haplotype. Other six clustered to the most common haplotype, which included specimens collected in the tributary to Wesser in Germany, as well as Morava (Danube) and Ohre (Elbe) in Czech Republic.
Finally, in the sampling site MOEL, one specimen clustered to genetic lineage 12 (P. septimaniae). This haplotype was unique and was not detected before, but it was 2-3 mutational steps distant to haplotypes detected in rivers L´Arve and Usses in France (Rhone), as well as in the Italian Alpine Lakes.
The haplotype network constructed with concatenated COI-cytb sequences is available in the supplementary material ( Figure S1). It supported the results of the COI haplotype network.
Genetic diversity indices for genetic lineages are given in Table 2. Lineage 10 (P. phoxinus) had the lowest haplotype diversity, while lineage 11 (P. cf. morella) had the lowest nucleotide diversity. Both indices (Hd and π) were the highest in the lineage 12 (P. septimaniae).

Nuclear DNA
Fourteen specimens from sampling sites DON (3 specimens), EPE (4 specimens), MEE (4 specimens) and VLO (3 specimens) were sequenced and analyzed for the RAG1 gene. The amplification and sequencing of MOEL sample failed. Additionally, 406 RAG1 sequences from the GenBank (Table S1) were added to the alignment, which was trimmed to 901 bp. The gametic phase of the combined dataset of 420 RAG1 sequences was determined and resulted in 840 alleles.
The calculated network ( Figure S2) did not present a clear grouping of the specimens, whether according to their mitochondrial genetic lineage or based on geography. The samples from this study (marked black) did roughly form two groups, on the opposite sides of the network, one by EPE sample and the other by VLO, DON and MEE, of which one DON, one MEE and one VLO shared the same allele with specimens from Auelsbach creek, Agger (Rhine) and Seymaz River, L′Arve (Rhone). However, some alleles (DON, EPE, MEE and VLO) were scattered across the network. The only other representative of the genetic lineage 11 (P. cf. morella) originating from other studies (GenBank MG806195, Vazovecky creek, Jizera River, Elbe), did not share alleles with EPE sample and was four mutational steps distant from the closest EPE allele. This specimen also did not share COI haplotype with any of the EPE specimens (MG806861, Fig. 3c).

Discussion
According to mitochondrial lineages identified in this study, the Belgium and Netherlands minnows, sampled in the rivers Geul (MEE), Berwinne (MOEL) and Roer (VLO), which are all tributaries to Meuse (North Sea), belong to the species P. phoxinus (lineage 10). The finding is in line with the known distribution of P. phoxinus species, which extends over the Seine, Meuse and Rhine watersheds (Knebelsberger et al. 2015;Denys et al. 2020) and was, based on the analysis of the museum specimens, proposed as the original/historical lineage in the area by Knebelsberger et al. (2015). However, in the sampling sites MEE and MOEL, mitochondrial lineages characteristic for P. csikii (lineage 5b) and P. septimaniae (lineage 12) were also found, pointing to human mediated introductions/invasions; especially as the same was observed in the neighboring rivers (Knebelsberger et al. 2015;Denys et al. 2020; Fig. 1). Furthermore, the proportions of specimens carrying P. csikii (2 specimens, one from each locality) and P. septimaniae Fig. 3 a and b represent the pie charts with the number of analyzed specimens per sampling site and their genetic lineages (close up-(a), broader area-(b)). Figure 3c represents the haplotype network based on cytochrome oxidase I (COI) fragment and four genetic lineages (legend), which were detected in the newly collected samples (DON, EPE, MEE, MOEL, VLO) in this study together with the reference sequences from the GenBank. The lines carry the number of mutations (shown in red) where more than one (one specimen from MOEL) mitochondrial lineages are in line with the ones recorded by Knebelsberger et al. (2015) in the Sieg/Agger/Rhine catchments, where the numbers are favoring P. phoxinus. The prevailing of P. phoxinus haplotypes suggests recent introductions of P. csikii and P. septimaniae rather than a past water connection facilitating the exchange between the populations, such as between Danube and Rhine some fifteen thousand years ago (Leuven et al. 2009). Nevertheless, in some sampling sites, P. csikii haplotypes seem to be more common (e.g., Windeck and Netphen, both on the Sieg River (Rhine, North Sea), Fig. 3) pointing to a strong invasion if not a historical/natural occurrence at these sites. Besides the possible dispersion through numerous artificial canals in this area, minnows in the Geul (Meuse, North Sea) could also originate from abundant fish hatcheries rearing Salmonidae, with especially intensive introductions in the 1960's, attempting to repopulate the area with Salmo trutta (Crombaghs et al. 2000). The origins of these stocks and whether they also contained Phoxinus is unknown. However, these practices are likely to be the source of P. septimaniae and P. csikii in this region nowadays (see also the section on lineage 11-P.cf. morella below).
Surprisingly, all the specimens sampled in the Agger River (DON) clustered to P. phoxinus, which is in contrast to the detected genetic admixture in the neighboring sampling sites from Agger and Sieg (Knebelsberger et al. 2015). The same is true for VLO population (Roer River), which was rediscovered in 2002, likely a consequence of a colonization from upstream German parts of this river (Schaik and Gubbels 2003). However, the admixture still might be present, albeit, due to low number of analyzed specimens, not detected by the current study.
The specimens collected at Verloren Beek (EPE), a tributary to Rhine, carry mitochondrial lineage 11 (P. cf. morella, not a valid species, Palandačić et al. 2020). According to the poster from Rothe et al. (2019), the closest this lineage is found in Germany is near Osnabruck in the Ems watershed (Fig. 1); however, beyond the presented poster, their data is not published, thus no sequences are available for comparison. The origin of the Phoxinus in the Verloren Beek (EPE, Rhine catchment) is not clear; it has been recorded here for a century and the stream seems to possess suitable ecological conditions, exhibiting other usual rheophillic species such as lamprey and bullhead. Moreover, the haplotype, which was detected at the closest location in Wesser River, was also present at the EPE sampling site (3 specimens). It was shown previously that the distribution of Phoxinus does not follow watershed boundaries (Palandačić et al. 2017(Palandačić et al. , 2020, thus lineage 11 could be distributed in the Verloren Beek naturally. However, it is likely that Phoxinus in the Verloren Beek originated from fish hatcheries rearing Salmonidae from the late nineteenth century till present in this region. Trout were imported from countries like Denmark and were reared in streams such as Verloren Beek (Anomymous, 1929). Phoxinus minnows were recommended as a food source (Mulier 1900) and were actually present in at least one known hatchery (Peusens 1915). Escapes are likely to have taken place. The earliest known Phoxinus distribution data from this region (1915) (NDFF, 2021) also correspond both in place and time with the activities in the hatcheries.
The results of the nuclear RAG1 gene analysis are inconclusive, yet some of the specimens from DON, VLO and MEE do roughly form their own group on one side ( Figure S2), while some can be found scattered across the network, pointing to hybrid individuals. The RAG1 gene has previously been criticized for its usefulness of detecting admixture in Phoxinus (Palandačić et al. 2020) and the same pitfalls can be observed here. Thus, the detected genetic pattern can be a consequence of incomplete lineage sorting, problems with the gametic phase determination or lack of genetic variability rather than a sign of admixture. Consequently, the results here, as in many previous studies, are based mostly on the COI region and additional nuclear markers as well as a greater number of sampling sites and specimens could provide a more detailed insight into the origins of the Phoxinus minnows in the Netherlands and Belgium.
The current distribution of freshwater organisms in Europe and their phylogeographic lineages were shaped by the last Pleistocene glacial period and the ensuing climate warming (Hewitt 2004). In the literature, several potential ice age refugia have been documented, with the three main south peninsulas (the Iberian, the Apennine and the Balkan Peninsula) playing the central role. Correspondingly, it is commonly believed that the cold temperature and the reduced water availability forced most of the local fishes to retreat into isolated southern ice-age refugia (Csapó et al 2020). However, Phoxinus minnows can survive adverse cold climatic conditions, including severe winters under ice cover, and are very adaptable to various stream and lake habitats as well as plastic with regard to dietary preferences (e.g., Frost 1943). For these reasons, in general, the location of Phoxinus glacial refugia, timing of lineage diversification, and dynamics of post-glacial colonization are hard to predict on either environmental constraint to geographic expansion or strict ecological requirements of the genus alone. Nevertheless, in Phoxinus, eleven different lineages/species occupying the Balkan peninsula were detected (Palandačić et al. 2017), which show higher levels of allopatric differentiation, and a complex genetic structure with several sub-lineages occupying a narrow geographic area (e.g., in P. lumaireul). Similarly, southern lineage/species P. septimaniae exhibit high level of haplotype diversity (0,91) in contrast to northern lineage/species P. phoxinus (0,39). While P. septimaniae also displays a relatively high nucleotide diversity (0,63%), a northly distributed lineage 11 (P. cf. morella) has a haplotype diversity of 0.74, with the nucleotide diversity the lowest of the four tested lineages (0.18%), pointing to a fast dispersal following a period of a small population size (Grant and Bowen 1998). A current rapid population spread is consistent with lineage 11 dispersing from a refugia after the last glaciation (e.g., Pannonian). Nevertheless, the herein presented phylogenetic tree (Fig. 2) as well as the trees presented in other publications (Palandačić et al. 2017(Palandačić et al. , 2020 show low statistical support for the deep nodes. This makes even approximate timing of the splits between the lineages/species unreliable (Corral-Lou et al. 2019). The pattern of the southern lineages showing a greater genetic diversity reflects longer periods of isolation in comparison with those from northern Europe that are less divergent and represented by fewer genetic clusters, a phenomenon, which has been previously discussed in literature (e.g., Csapó et al. 2020). Nevertheless, due to the complexity of the present and past colonization routes of Phoxinus (Palandačić et al. 2020;Garcia-Raventós et al. 2020) further analysis beyond two mitochondrial and one nuclear gene is needed to draw final conclusions on the origin of Phoxinus species in Belgium, the Netherlands and Europe in general.