Genetic diversity and relatedness in aquaculture and marina populations of the invasive tunicate Didemnum vexillum in the British Isles

Introductions of invasive, non-native species in the marine environment are increasing as human activity within coastal areas rises. Genetic datasets are useful tools to identify source populations, track routes of invasions, and illuminate the role of genetic variation in the establishment and subsequent spread of novel introductions. Here, a microsatellite dataset is used to estimate the genetic diversity and population structure of 7 introduced Didemnum vexillum populations in Britain and Ireland, 4 of which are associated with aquaculture and 3 with marinas. Genetic differentiation observed between these populations indicates human-mediated transport as the main mechanism underlying the population structure of D. vexillum in Britain and Ireland. In addition to elucidating patterns of population structure we found that aquaculture sites showed significantly higher genetic diversity (measured as allelic richness) in comparison to the marina sites. We discuss these findings in relation to the history of each invasion, the complex life history of D. vexillum, and available evidence of the relative invasiveness of these populations. Our results show numerous interesting patterns which highlight further research avenues to elucidate the complex factors underlying the global spread of this successful invader.


Introduction
Examining patterns of genetic diversity in invasive non-native species (INNS) can provide important mechanistic insights into the pathways of invasion and assist future management (Wellband et al. 2017). Historically, newly introduced populations of INNS were expected to exhibit low levels of genetic variation due to genetic drift, a result of population bottlenecks and founder events following transport from their native range (Dlugosch and Parker 2008;Crawford and Whitney 2010). Despite these expectations, many successful introductions of INNS have retained high genetic diversity (Lavergne and Molofsky 2007;Crawford and Whitney 2010;Wellband et al. 2017), likely the result of high propagule pressure (Simberloff 2009), multiple introduction events (e.g., Kolbe et al. 2004;Dlugosch and Parker 2008), and/or stratified dispersal (Darling and Folino-Rorem 2009;Tobin and Blackburn 2008;Berthouly-Salazar et al. 2013). Rius et al. (2015), report clear evidence of reduced genetic diversity in only 23% studies of introduced marine populations, with most studies (74%) reporting no change in diversity between introduced and native populations. In addition to variability in genetic characteristics, interspecific comparisons of INNS suggests that there is a high degree of variability in the rate of range expansions following new invasions, with some populations expanding rapidly (O'Neill and Dextrase 1994), and others experiencing significant lag phases before expansion (Aikio et al. 2010). Further, not all populations of an INNS have equal potential for becoming invasive (Allendorf and Lundquist 2003), and instances of marine INNS with both invasive and non-invasive populations have been reported (e.g., Osman and Whitlatch 2007).
The carpet sea squirt (Didemnum vexillum) is a colonial ascidian native to Japanese waters ) that has invaded temperate regions globally, likely via the movement of ships with fouled hulls, and/or as epifaunal growth on cultured Pacific oysters. D. vexillum is an invasive species of particular concern due to its ability to significantly alter the structure, biodiversity and function of ecosystems and communities (Mckenzie et al. 2017). Despite reductions of genetic diversity in comparison to native populations in both mitochondrial (Smith et al. , 2015Ordóñez et al. 2015) and nuclear (Casso et al. 2019a) markers, introduced populations from a single clade of D. vexillum have successfully colonised a variety of environments Smith et al. 2015). Across introduced populations, colonies vary in size (Coutts 2002;Valentine et al. 2007), morphology (Coutts and Forrest 2007;Lambert 2009), and in their extent of spread from artificial structures where they were originally introduced to natural habitats. Dispersal within regions where D. vexillum has been introduced is likely a combination of both natural and human-mediated dispersal. Shortdistance dispersal likely occurs frequently via larvae (D. vexillum has a short-lived larval duration lasting from 2-36 h; Fletcher et al. 2013), while sporadic long-distance movements can occur both naturally through fragmentation and reattachment of colonies (Morris and Carman 2012) and/or rafting of colonies attached to floating materials (Worcester 1994) and via human-mediated mechanisms.
Increasing economic activity and vessel movement within and around the Irish Sea make the continued spread of D. vexillum a significant threat to both British and Irish ecosystems and economies. Given the complex interaction between natural and humanmediated dispersal, and its range of observed phenotypes, there is still much to learn about D. vexillum introductions, and genetic tools remain invaluable resources to address these questions. Here we estimate genetic differentiation of D. vexillum populations around the coasts of Britain and Ireland. We hypothesize that human-mediated dispersal is most likely to drive the genetic structure between populations and predict that sites connected by human activity (e.g., marina sites connected by vessel movement or aquaculture sites linked by transfer of contaminated seed stock) will be more closely related, regardless of geographic distance. We also estimate genetic diversity for each population, and discuss these results in light of qualitative evidence of invasiveness, and the complex life-history of D. vexillum. If the variable invasiveness of D. vexillum introductions can be linked to specific genetic characteristics (i.e. discrete genetic clusters or patterns in diversity), the management of both existing and novel introductions can be prioritized to those with greater risks of impact.

Study area
The coastlines of Britain and Ireland have experienced multiple introductions of Didemnum vexillum since it was first recorded in Ireland in 2005 (Malahide Marina; Sides 2006) andin Britain in 2008 (Holyhead marina, Wales;Griffith et al. 2009). The current distribution of D. vexillum includes numerous introductions concentrated in areas supporting human activity (oyster aquaculture farms and marinas). Here, we sampled from 7 geographically isolated introduced populations within England, Ireland and Wales (Fig. 1, Table 1). We also qualitatively described each population (colonization history, morphology and habitats colonized) by combining evidence from peer-reviewed and grey literature sources alongside anecdotal evidence from industry and government professionals and our own personal observations (Online Resource 1).

Sample collection and DNA extraction
We sampled a total of 419 D. vexillum colonies from seven locations in the summer months between July 2017 and September 2019. Clew Bay, Galway Bay, Dunmanus Bay and Kent are sheltered sites in which oyster aquaculture takes place on intertidal trestles. The sites at Malahide and Holyhead are marinas containing boats and other floating structures (buoys, pontoons, etc.). Strangford Lough supports both oyster aquaculture and marina infrastructure both of which have been colonised by D. vexillum. It is unknown whether the D. vexillum population in the marina that was sampled for this study spread from aquaculture sites or originated from a separate introduction directly to the marina. See Online Resource 2 for detailed information on sampling methods.
Total genomic DNA was extracted from tissue sections using a Qiagen DNeasy Blood & Tissue Kit, according to the manufacturer's protocol. As D. vexillum colonies have been reported to fuse and form chimeric colonies at the fusion interface (Fidler et al. 2018;Casso et al. 2019b), DNA was extracted from a small section of tissue (\ 100 mg) to minimize the probability of capturing a fusion interface.

Genetic profiling of samples
Before microsatellite profiling, we sequenced an * 600 bp region of the COI gene in a subset of 71 randomly chosen samples across all populations (between 3 and 22 samples/population) to confirm species ID using the tunicate-specific primers of Stefaniak et al. (2009). As this is the first known report of D. vexillum in Dunmanus Bay, we sequenced all samples collected from this site at the COI gene, in addition to an * 480 bp fragment of the 18S rRNA gene using the primers designed by Price et al. (2005).
Twenty microsatellite loci from published literature were initially assessed for use in this study (Abbott et al. 2011;Fidler et al. 2018;Watts et al. 2019), from which we selected 7 polymorphic loci (Dvex03, Dvex10, Dvex11, Dvex18, Dvex19, Dvex20 and Dvex42) that could be amplified reliably and scored consistently across all samples. See Online Resource 2 for detailed information on genetic profiling methods. The R (R Core Team 2018) package adegenet version 2.0.1 (Jombart 2008) was used to estimate observed (H O ) and expected (H E ) heterozygosity for each locus, and to conduct t-tests to evaluate whether average H O and H E were significantly different across loci for each population. This package was also used to conduct Bonferroni-corrected (a = 0.05) chi-squared tests to detect loci that departed from Hardy-Weinberg Equilibrium (HWE). GENEPOP (version 4.7; Raymond and Rousset 1995;Rousset 2008) was used to implement Fisher's exact tests for linkage disequilibrium (LD) for all locus pairs within each population. HP-RARE version 1.1 (Kalinowski 2005) was used to estimate average allelic richness (AR) and private allelic richness for each population, which were corrected for our lowest sample size (N = 15) using rarefaction. Lastly, the R package diversity version 1.9.90 (Keenan et al. 2013) was used to generate multiple estimates of both global and pairwise genetic differentiation (F ST [Weir and Cockerham 1984]; G ST [Nei 1973]; and Jost's D [Jost 2008]) in addition to the As D. vexillum can reproduce asexually sample sizes reflect the total number of samples collected (N; in brackets is the number of samples collected from artificial substrates/and natural substrates), the number of samples that were identified as clones based on identical genotype profiles (N CLONES ; in brackets is the proportion of samples that were clones), the number of samples that were identified as putative chimeric colonies based on the presence of [ 2 alleles at least 1 locus (N CHIMERA ; in brackets is the proportion of samples that were putative chimeras) and the number of genetically unique, non-chimeric samples used in data analysis (N ANALYZED ) inbreeding coefficient, F IS , with 95% confidence intervals on each estimate (999 bootstraps).
To determine whether there were significant differences in genetic diversity between populations from different habitats, we conducted comparisons of genetic diversity metrics (H O , AR, gene diversity [H S ] and F IS ) among populations grouped by habitat type (aquaculture vs. marina) using the software FSTAT v2.9.4 (Goudet 2003). For this analysis we employed two-tailed tests using 10,000 permutations to test for significance. To examine the distribution of genetic variation within and among samples in relation to both population and habitat type, we conducted an analysis of molecular variance (AMOVA) in Arlequin v3.5.2.2 (Excoffier and Lischer 2010) and evaluated the significance of differentiation between and within groups using 10,000 permutations. As the Kent population may have been originally introduced via aquaculture but since spread to other natural and artificial environments, we conducted 2 tests for both of the analyses described above: (1) including Kent in the aquaculture group, and (2) excluding Kent from the analysis.
In addition to estimates of genetic differentiation (F ST , G ST , Jost's D), the Bayesian clustering software STRUCTURE version 2.3.4 (Pritchard et al. 2000) was used to test scenarios of K = 1-7, running 10 iterations for each value of K and using a burn-in period of 500,000 followed by 1 9 10 6 Markov chain Monte Carlo iterations. An admixture model was specified, as secondary introductions or movement of vessels/aquaculture equipment between sites may have facilitated admixture between populations. As STRUCTURE has been shown to incorrectly estimate K when sampling is unbalanced (Wang 2017), the independent allele frequency model was selected, the alternative ancestry prior was applied (allowing for unequal representation of source populations in each sampled population), and the default value of ALPHA was changed to 1/K (0.14). The Evanno method (Evanno et al. 2005) was implemented in STRUC-TURE HARVESTER (Earl and vonHoldt 2012) to select the most likely number of genetic clusters, and the 10 replicate runs for the best supported K were summarized with the software CLUMPP (Jakobsson and Rosenberg 2007). The results were visualised with DISTRUCT (Rosenberg 2004). Due to the reliance of STRUCTURE on the underlying assumptions of population genetics models, population structure was also assessed with a multivariate analysis, Discriminant Analysis of Principal Components (DAPC), implemented in the R package adegenet version 2.0.1 (Jombart 2008). As DAPC does not rely on the assumption of mutation-drift-equilibrium, it has been suggested to be appropriate for invasive species (Wellband et al. 2017). It has also been shown to be a reliable method for detecting genetic clines or clusters that were not detected with STRUCTURE in weakly differentiated populations (Jombart et al. 2010;Kanno et al. 2011).

Genetic diversity and population structure
Of our sample subset sequenced at the COI gene (N = 71), all were identified as D. vexillum with 99-100% identity matches to at least 1 D. vexillum sequence in the NCBI GenBank database. At the 18S rRNA gene, all Dunmanus Bay samples were identified as D. vexillum with 99-100% identity matches to at least 1 D. vexillum sequence in the GenBank database. We trimmed sequences to remove base calls of low quality and submitted all sequences to GenBank (Accession Numbers COI: MW425612-MW425681; 18S: MW415990-MW416011).
We observed a total of 6.9% clones in our dataset (range 0-18.5%; Table 1), all of which were collected from the same local area within each site. We also observed very few putative chimeras in our samples (2.7% of all samples) across sites (Table 1). The number of alleles per locus ranged between 3 and 14 within our dataset. See Online Resource 3 for a summary of H O /H E , HWE, LD, and F IS results. Allelic richness (AR) was lower in marina sites (average AR = 3.15) compared to aquaculture sites (average AR = 4.64; Table 2). Sites with the highest AR were also those that showed the highest numbers of private alleles ( Table 2). Comparisons of genetic diversity metrics (H O , AR, gene diversity and F IS ) for populations grouped by habitat revealed a significant difference in AR between aquaculture and marina habitats, regardless of whether Kent was included (p = 0.025) or excluded (p = 0.025) from the analysis. All other tests were insignificant (p [ 0.05).
Estimates of genetic differentiation showed largely similar patterns for each of the metrics measured ( Bay. The most highly differentiated sites were marina sites, with Strangford Lough being the most differentiated across all sites. While Holyhead Marina was closely related to Malahide Marina, it was also highly differentiated from all other sites. Of the aquaculture sites, Galway Bay was most highly differentiated, with the lowest differentiation between Clew Bay and Dunmanus Bay. Our AMOVA analysis revealed significant partitions in genetic variation among habitat groups, among populations within habitat groups, among individuals within populations and within individuals, the latter explaining the largest percentage of variation in our dataset (Online Resource 5). These significant differences were retained regardless of whether we included Kent in the analysis, however we note that the difference among habitat groups becomes less significant when Kent is included (p \ 0.001 with Kent removed vs. p \ 0.01 with Kent included). Both Evanno's Delta K and Ln(K) plots provided strong support for 3 genetic clusters using our set of 7 presumably neutral microsatellite markers (Online Resource 6). These clusters corresponded to groupings of (1) marina (Holyhead and Malahide Marinas) and (2) aquaculture (Galway Bay) sites, with (3) Kent and Strangford Lough forming a third, genetically distinct cluster (Fig. 2). The aquaculture sites Clew Bay and Dunmanus Bay showed admixture between genetic clusters 2 and 3. The results of the DAPC analysis mirrored the STRUCTURE results closely (Fig. 3).

Discussion
Estimates of genetic relatedness among D. vexillum populations show a complex picture, with a combination of effects related to regional proximity and habitat type. Firstly, AMOVA results suggest significant partitions in genetic variation among habitat types (i.e., marinas vs. aquaculture), even though these groupings account for the smallest proportion of variation overall. Additionally, while we show genetic connectivity between some spatially clustered sites, there are notable exceptions of long-distance connectivity, providing evidence for potential source populations and routes of invasion within Britain and Ireland. For example, high genetic connectivity was detected between Malahide and Holyhead marinas, situated on opposite sides of the Irish Sea with substantial boat traffic between them. In contrast, the third marina site in Strangford Lough, on the east coast  (Fletcher et al. 2013), the patterns of genetic differentiation and population structure reported here (Figs. 2, 3, Online Resource 4) are likely to reflect human-mediated dispersal rather than natural dispersal as the primary mode of connectivity between populations throughout Britain and Ireland. This finding agrees with a growing body of literature attributing aquaculture practices and maritime trade to the introduction and spread of INNS globally (e.g., Voisin et al. 2005;de Barros et al. 2009;Blakeslee et al. 2010;Meistertzheim et al. 2013;Cruz Capel et al. 2017). At the local level, natural and human-mediated dispersal are more difficult to disentangle. The presence of individuals from STRUCTURE cluster 2 in all three aquaculture-associated populations on the west coast of Ireland (Galway Bay, Clew Bay and Dunmanus Bay) suggests a common source. However, it is not clear whether this has arisen due to natural dispersal from a single introduction or separate human-mediated introductions (i.e., the use of oyster seed from a common contaminated source at all three sites). The capabilities of D. vexillum colonies to fragment and re-attach to new substrates supports the potential for natural dispersal and connectivity between proximate populations (Bullard et al. 2007;   Reinhardt et al. 2012) and as fragments can be viable in the water column for up to 3 weeks (Morris and Carman 2012), dispersal of D. vexillum via drifting or rafting may promote connectivity at a regional scale (i.e., among aquaculture sites). In contrast, individuals assigned to STRUCTURE cluster 3 are common at Clew Bay and Dunmanus Bay but very rare at Galway Bay, despite the fact that Galway Bay lies between the other two populations. This suggests separate introductions facilitated by human-mediated transportation, or divergence between sites following a single introduction due to a lack of gene flow, and refutes the likelihood of natural dispersal between sites. The inclusion of additional invasive (European) and native (Japanese) populations outside our study area would help to resolve the complex relationships between these populations (e.g. as in Casso et al. 2019a), and importantly allow for the identification of source populations.
In addition to enhancing our understanding of the patterns of connectivity among populations, our work also provides insight into the potential role of environment and/or genetic diversity in determining invasiveness. Qualitative measures of invasiveness were higher in populations of D. vexillum at aquaculture sites, and these populations also had greater allelic richness. Local environment can have an important role in determining the likelihood of INNS becoming invasive (e.g., Alpert et al. 2000;Burns 2006). For example, in the relatively open coastal habitats used for oyster aquaculture, D. vexillum may experience greater competitive advantage (Osman and Whitlatch 2007) and there is greater opportunity for natural dispersal of larvae and therefore local spread. However, marina environments, despite being more confined, may provide superior conditions for growth (e.g., protection from air exposure/desiccation, sunlight, sedimentation, and strong wave action; Daniel and Therriault 2007 and references therein), and opportunities for human-mediated dispersal.
An alternative explanation for the observed variation in invasiveness is the potential for a relationship between genetic diversity and invasive success. Lower genetic diversity can lead to the reduced success of invasive populations via both neutral and adaptive processes that vary temporally. From a neutral perspective, low genetic diversity may be indicative of introductions with low propagule pressure, which can negatively affect invasion success via both genetic and demographic processes (Hufbauer et al. 2013;Blackburn et al. 2015;Bock et al. 2015). For example, populations with lower genetic diversity are more likely to suffer the negative effects of inbreeding depression, which is likely to affect populations at early stages of introduction when population sizes are small (Charlesworth and Charlesworth 1987). Additionally, populations with high genetic diversity have a higher probability of containing genotypes that facilitate establishment success (e.g. Gamfeldt et al. 2005), or resilience to disturbance or novel environments (e.g. Reusch et al. 2005;Crawford and Whitney 2010). Over longer time-scales, admixture may play a large role in determining invasive success (e.g., Keller and Taylor 2010;Hudson et al. 2020), by increasing genetic variation for evolution, creating novel phenotypes via new allele and gene combinations, facilitating heterosis, and masking or purging deleterious mutations which may counteract the initial negative effects of bottlenecks and/or inbreeding (Verhoeven et al. 2011;Bock et al. 2015). Thus, if aquaculture populations have experienced more inter-population admixture than marina populations, this could explain their elevated levels of genetic diversity and increased invasive success in comparison to our more geographically isolated marina populations. From a selection standpoint, if neutral and functional nuclear diversity are positively correlated, then high standing genetic diversity may benefit selection and facilitate survival and spread in novel and stressful environments (Prentis et al. 2008). Indeed, greater genetic variation in ecologically important traits has been related to greater niche breadth in geographically widespread species (Sheth and Angert 2014). While potentially comprising an important mechanism of invasiveness, the influence of selection in D. vexillum cannot be properly investigated with our dataset of neutral markers, and we encourage further research using quantitative traits and functional genomic loci.
As a group, the three marina populations analysed here exhibited significantly lower allelic richness than the four populations associated with oyster aquaculture. While higher genetic diversity in aquaculture populations may be attributed to the greater scale of sampling and more diverse substrates sampled at aquaculture sites in comparison to marinas, we note that our sampling was representative of true differences between the habitat types; marina populations in general did not show extensive spread throughout the local area and did not colonise available natural substrates. Given that the introduction of D. vexillum within the UK and Ireland is spatially distributed such that populations associated with similar habitats (i.e. aquaculture vs. marina) are spatially clustered, the extent to which we are able to disentangle the relative contribution of habitat type from other regional biotic and abiotic factors is limited.
While genetic datasets can be useful to describe underlying patterns of connectivity and diversity in INNS populations, the complex ecology of many INNS make interpretations from genetic data difficult. D. vexillum undergoes both sexual and asexual reproduction (Fletcher et al. 2013), which can make interpretations of links between neutral genetic diversity and invasiveness difficult (Dlugosh et al. 2015). Uniparentally reproducing species may promote invasiveness if (1) certain clonal lineages are better adapted, (2) there is a shift in reproductive mode (Le Cam et al. 2020;Wellband et al. 2017) or, (3) quantitative genetic variation associated with polygenic traits is retained despite bottlenecks or founder events (Barrett 2015). Our data shows the highest proportion of clonal reproduction in our seemingly least invasive marina population, thereby contrasting with the hypothesis that clonal reproduction should increase invasive success. However, the high proportion of unique colonies included in our dataset suggest that there is a strong influence of multiple introductions and/or sexual reproduction in our populations. The tendency of D. vexillum colonies to fuse, forming chimeras, further complicates the possible interpretations of our dataset. For example, Watts et al. (2019) show that changes in allele frequencies produced disparate genetic groupings when comparing datasets that included versus those that excluded genetic data from chimeric colonies. Evidence of chimeras was low in this dataset (2.7%) in contrast to other regions (Clancy 2015;Watts et al. 2019;Casso et al. 2019b) suggesting that the impact of chimeras on estimations of genetic clusters would be small in our study area.
While some of the genetic mechanisms involved in the success of INNS in general (Stapley et al. 2015;Tepolt and Palumbi 2015;Hawes et al. 2018a;Gleason 2019), and D. vexillum, specifically [e.g., plasticity (Ordóñez et al. 2015), epigenetics (Hawes et al. 2018b(Hawes et al. , 2019, and microbiome effects (Casso et al. 2020)], have been identified, many questions remain unanswered. Our results highlight the importance of genetic tools in invasion management, in particular for tracking secondary spread and evaluating the risk of invasiveness in new introductions.