Genome-wide SNP analysis of three moose subspecies at the southern range limit in the contiguous United States

Genome-wide evaluations of genetic diversity and population structure are important for informing management and conservation of trailing-edge populations. North American moose (Alces alces) are declining along portions of the southern edge of their range due to disease, species interactions, and marginal habitat, all of which may be exacerbated by climate change. We employed a genotyping by sequencing (GBS) approach in an effort to collect baseline information on the genetic variation of moose inhabiting the species’ southern range periphery in the contiguous United States. We identified 1920 single nucleotide polymorphisms (SNPs) from 155 moose representing three subspecies from five states: A. a. americana (New Hampshire), A. a. andersoni (Minnesota), and A. a. shirasi (Idaho, Montana, and Wyoming). Molecular analyses supported three geographically isolated clusters, congruent with currently recognized subspecies. Additionally, while moderately low genetic diversity was observed, there was little evidence of inbreeding. Results also indicated > 20% shared ancestry proportions between A. a. shirasi samples from northern Montana and A. a. andersoni samples from Minnesota, indicating a putative hybrid zone warranting further investigation. GBS has proven to be a simple and effective method for genome-wide SNP discovery in moose and provides robust data for informing herd management and conservation priorities. With increasing disease, predation, and climate related pressure on range edge moose populations in the United States, the use of SNP data to identify gene flow between subspecies may prove a powerful tool for moose management and recovery, particularly if hybrid moose are more able to adapt.


Introduction
Moose (Alces alces) are the sole extant members of the Alces genus and are widely distributed among subarctic regions of the northern hemisphere (Geist 1998;Hundertmark and Bowyer 2004). In North America, four subspecies have been described based on biogeography and morphology (Peterson 1955;Hall 1981), with three of the subspecies' boundaries extending south into the contiguous United States (U.S.): A. a. shirasi (Shiras moose), A. a. andersoni (Western moose), and A. a. americana (Eastern moose), and the fourth subspecies, A. a. gigas (Alaskan moose), inhabiting Alaska and northwestern Canada (Fig. 1). With an overall estimated population of around one million in North America, A. alces are not considered to be a species of concern by the United States government (Timmermann and Rodgers 2017). Therefore, management has predominantly been state or regionally focused in an effort to maintain sustainable populations as a local natural resource (Wattles and DeStefano 2011;Timmermann and Rodgers 2017).
While some moose subpopulations are expanding, others (notably along the southern periphery) are in steep decline due to habitat loss and reduced forage quality (Timmermann Jason A. Ferrante and Chase H. Smith contributed equally to this work.

3
and Rodgers 2017; Schrempp et al. 2019), increased predation (Mech and Fieberg 2014;Timmermann and Rodgers 2017), and increased prevalence of disease (Samuel 2007;Lankester 2010;Ditmer et al. 2020;Ellingwood et al. 2020). In response, some local managers and state governments have instituted regulations on hunting and other activities, including an instance of ceasing permits for radio collar telemetry-based population studies in Minnesota over concerns that "unintended and unanticipated mortality of moose occurred during the collaring process" (Minnesota, Executive Office of the Governor [Mark Dayton] 2015; Phillips 2021). As management authorities seek to reduce the rate of decline, understanding the genetic makeup of these peripheral moose populations is key to prioritizing conservation actions at the edge of the species' range.
Range edge populations are often characterized by reduced effective population sizes (Piry et al. 1999;Bouzat 2010) and increased mating of related individuals (Neaves et al. 2015). The subsequent reduction in genetic diversity may reduce the fitness of individuals and the evolutionary potential of a species, thereby increasing the probability of population extinction (Bouzat 2010;Bijlsma and Loeschcke 2012;Mimura et al. 2017). Low genetic diversity is a known trait among the North American moose populations and likely reminiscent of a founder effect (Hundertmark and Bowyer 2004). The North American moose population is estimated to have been founded 11 to 14 Ka as a small population of moose entered the Americas via the Beringian land bridge (Guthrie 1995;Hundertmark et al. 2002b;Meiri et al. 2014). Subgroups of moose are hypothesized to then have dispersed throughout North America via rare, long-distance (leptokurtic) dispersal events (Hundertmark et al. 2003), extending now into many northern U.S. states. Moose in North America may have maintained inherently low levels of genetic diversity since their founding, making it important to take into account their demographic history when evaluating levels of inbreeding and genetic health. Investigating current genetic diversity among range edge moose populations will provide a baseline for evaluating the impacts of climate change on disease prevalence and habitat quality.
Since the 1990's, genetic studies of moose have relied on cytogenic, mitochondrial DNA (mtDNA), and microsatellite markers to investigate genetic structure and subspecies designations (Boeskorov 1997;Hundertmark et al. 2002aHundertmark et al. , 2003Hundertmark and Bowyer 2004;DeCesare et al. 2020), though genome-wide analyses are rare (e.g., Kalbfleisch et al. 2018). While mtDNA and microsatellites have been useful for elucidating subspecies boundaries, the use of genomic techniques provides increased capability for local-scale analysis that can better inform state and regional conservation and management planning (e.g., Funk et al. 2012Funk et al. , 2019Coates et al. 2018). In particular, genotyping-by-sequencing (GBS) approaches (e.g., restriction enzyme-based sequencing approaches) are increasingly used in ecological and evolutionary research (Kjeldsen et al. 2016;Foote and Morin 2016;Ba et al. 2017;Cammen et al. 2018). These methods yield thousands of molecular markers to better resolve genetic differences, without the need of a published genome, thus enabling a more comprehensive investigation of genetic diversity and structuring of populations (Davey et al. 2011;Kjeldsen et al. 2016;Roffler et al. 2016;Yang et al. 2016;Mérot et al. 2020).
Despite its many benefits, the use of SNP methods to investigate population genetic variation on moose has, to date, focused primarily on a small number of populations in Scandinavia (Nichols and Spong 2017;Blåhed et al. 2019). In the United States, Kalbfleisch et al. (2018) also analyzed moose SNPs, but their study was limited to four individuals across three subspecies (A. a. gigas, andersoni, and shirasi). For our study, we performed a SNP-based analysis to assess current subspecies designations and inter-subspecies genetic structure for the three recognized moose subspecies found in the contiguous United States (A. a. americana, andersoni, and shirasi). Our opportunistic sampling targeted the southernmost range of each subspecies, establishing baseline SNP profiles for moose populations most susceptible to environmental change. Herein, we describe our preliminary findings on the genetic diversity and structuring of these three subspecies with the goal of informing management and recovery efforts at regional and state scales.

Sample collection
We obtained a subset of moose tissue and blood samples collected between 2009 and 2017 from Idaho (ID), Montana (MT), Minnesota (MN), New Hampshire (NH), and Wyoming (WY) (Fig. 1). Samples included fresh blood preserved in Tempus™ Blood RNA Tubes (ThermoFisher, Waltham, MA), archived whole blood banked in -80 °C freezers, and tissue banked in -20 °C freezers or desiccant (Ferrante et al. 2021). All samples had been collected for research, during capture for radio collaring efforts or health assessments, carcass recovery, or provided by hunters to the state agencies, and stored at the collaborator's respective facilities (Table 1). Samples from A. a. shirasi were obtained from multiple states along almost the entire latitudinal range of the subspecies in the United States. Samples from A. a. andersoni were obtained from the northwest and northeast regions of Minnesota, where the subspecies population in the United States is primarily located. Alces a. americana had less representative sampling, consisting solely of a collection effort encompassing a small region in New Hampshire.

DNA extraction and quality control
DNA was extracted from tissue or whole blood using DNeasy kits (Qiagen, Germantown, MD) following manufacturers protocols, or phenol-chloroform isoamyl alcohol isolations from whole blood collected in Tempus tubes following Ferrante et al. (2018). DNA quantity was assessed by spectrometry using an Epoch™ microplate spectrophotometer (BioTek, Winooski, VT) and integrity was ensured by gel electrophoresis using a 0.8% agarose gel stained with ethidium bromide. If there was no visible indication of fragmentation, 1 µl of each sample was incubated in a Cut-Smart® restriction enzyme buffer (New England BioLabs, Ipswitch, MA) at 37 °C for 2 h and visualized by gel electrophoresis as a final quality control check prior to sequencing.

Genotyping by sequencing
The DArTseq GBS approach, developed and implemented by Diversity Arrays Technologies (DArT; www. diver sitya rrays. com, Bruce, Australia) is useful for genetic mapping and genome wide diversity analyses of species for which no reference genome is available, as was the case with A. alces. Genotyping by sequencing was conducted by DArT using ~ 600 ng of moose DNA. Samples were digested with restriction enzymes PstI and SphI following Wenzl et al. (2004) prior to 75 base pair single end sequencing on an Illumina HiSeq 2500 (San Diego, CA). Quality filtering and SNP reporting was conducted using the proprietary software DartSoft14 (Diversity Arrays Technology, Bruce, Australia) pipeline.
Additional data filtering steps were performed in RStudio (RStudio Team 2019) using the R package 'dartR'' v 1.1.11 ; Gruber and Georges 2019) following similar methods as previous studies Smith et al. 2021). In the following order, SNPs or individuals were removed when: (1) the SNP marker reproducibility (the proportion of alleles that give a repeatable result at a locus) was below 1.00, (2) there was more than one SNP per locus (i.e., the SNP with the highest degree of polymorphism was retained), (3) the locus had greater than 10% missing data, (4) individuals were missing greater than 10% of data, (5) the minimum allele frequency was less than 0.05, and (6) the loci deviated from Hardy-Weinberg equilibrium (HWE) (p = 0.0001).

Genetic analyses
We visualized the distribution of genetic variation among SNPs in the moose subspecies using principal coordinates analysis (PCoA) in the R package 'adegenet' v 2.1.1 (Jombart 2008; Jombart and Ahmed 2011). We also used modelbased approaches to identify patterns of genetic structure via the Bayesian clustering algorithm STRU CTU RE v 2.3.4 (Pritchard et al. 2000) and the non-negative matrix factorization algorithm TESS3 (Caye et al. 2016). For STRU CTU RE, we ran 20 iterations where K, the number of clusters (or populations), ranged from 1 to 7 (following lower rep test runs to K = 10 classifying structure ~ K = 2 or 3), with an initial burn-in period of 250,000 reps followed by 1 million Markov chain Monte Carlo (MCMC) repetitions. The STRU CTU RE analysis was carried out on the USGS Yeti supercomputer (Falgout and Gordon 2015). We identified the most appropriate K value in STRU CTU RE HARVESTER, which characterizes the data by the set of allele frequencies at each SNP and maximizing Hardy-Weinberg equilibrium within clusters (François and Durand 2010), by considering the ΔK method (Evanno et al. 2005), the least negative mean log likelihood (Ln P(D)), and the known geographic distribution of the moose sampled (Pritchard et al. 2000;Earl and vonHoldt 2012). Results were plotted using STRU CTU RE Plot v2.0 (Ramasamy et al. 2014) and it was deemed notable when samples had a q-value, or the proportion of SNPs shared, was less than 0.8 within their cluster (> 20% shared with another cluster). TESS3 was used to integrate collection localities with genotypic data. The program computes ancestry proportions distributed over geographic space and is useful in addition to STRU CTU RE for identifying population structure (Caye et al. 2016). TESS3 analyses were performed in the R package 'tess3r' v 1.1.0 (Caye et al. 2018) with the inclusion of geographic coordinates for each sample, and we modeled K = 1-7. Cross-validation criterion was used to select the most likely K and ancestry proportions were mapped on an ancestry matrix.
Pairwise F ST values were calculated using the R package 'StAMPP' v 1.5.1 (Pembleton et al. 2013) and 999 bootstrap replicates to estimate population differentiation

Results
DArTseq analysis identified 9,780 SNPs across 159 moose samples after quality filtering (Table 1). Our additional filtering (Ferrante et al. 2021) yielded 1,920 SNPs from 155 moose (ID: n = 10; WY: n = 25; MT: n = 29; MN: n = 38; NH: n = 53) which were found to be adequate for downstream analysis ( Table 2). The PCoA depicted three clusters congruent with the three taxonomically identified subspecies (Fig. 2). The first two principal components (of 32 total) explained 25% of the variation (20% and 5%, respectively), with the 3 rd and remaining components explaining 2.5% or less each (not shown). Notably, a subgroup of moose samples (n = 12) identified as collected from northern MT (hereafter NMT) were observed separated from the main cluster of A. a. shirasi moose in the direction of the A. a. andersoni cluster (Fig. 1). The STRU CTU RE HARVESTER analysis best classified the moose population structure as having K = 2 or K = 3 clusters. The ΔK value peaked at K = 2 ( Fig. 3A) with one cluster representing A. a. shirasi and the other A. a. andersoni and A. a. americana (Fig. 3B). The log probability of the data plateaued at K = 3 (Fig. 3A) and groupings aligned with existing subspecies designations and PCoA results (Figs. 2,  3B). In the K = 3 structure, all moose samples from the NMT region within A. a. shirasi displayed between 22 and 49% shared ancestry proportions with the A. a. andersoni samples and were congruent to the observed group separated from the A. a. shirasi cluster in the PCoA analysis (Fig. 2). No other moose samples displayed > 20% shared ancestry with another cluster (Fig. 3B). TESS3 analyses supported a K = 3 clustering (cross-validation plot; Fig. 4 inset), closely aligning with the results from the PCoA and STRU CTU RE analyses. All samples were assigned within their designated subspecies, and again, a high proportion of shared ancestry (> 20%) was observed between the NMT samples and the A. a. andersoni moose (Fig. 4, dotted rectangle) (Table 3b). The remaining SMT moose samples did not display such a change in F ST values, indicating low differentiation between the subpopulations (F ST < 0.05). Measures of genetic diversity among subspecies were similar, with AR ranging from 1.801 to 1.901, and observed (H O ) and expected (H E ) heterozygosity ranging from 0.229 to 0.268 and 0.253 to 0.268, respectively. The inbreeding coefficient (F IS ) was also low overall (range 0.010 to 0.099) among the  (Table 3c).

Discussion
Our study produced the first SNP-based estimates of genetic diversity for moose, which identified low baseline values for all three subspecies. Genome-wide evaluations of genetic variation are powerful approaches which can inform subspecies designations, provide a basis for understanding processes driving molecular diversification, and inform effective management and conservation strategies. Our GBS approach produced nearly 2,000 SNP loci for the genomic evaluation of U.S. moose without the need of a reference genome, and provided an expansive genomic evaluation in support of previous microsatellite and mtDNA approaches. Although our sampling design was limited to opportunistic live sampling and archived samples, our initial survey resulted in robust estimations of genetic diversity, patterns of genetic structure, and a putative hybrid zone between two moose subspecies-all of which will be useful in conservation and management planning, as discussed below.

Patterns of genetic variation in moose
The primary goal of this study was to perform a SNP-based analysis of North American moose at the southern range edge of their distribution, where regional subpopulation loss has been occurring. The results of our analyses support current subspecies designations (Figs. 2, 3, 4) and added insight to existing diversity estimates. For instance,  (Table 3a). Our limited sampling allowed for some intra-subspecies analysis, specifically for A. a. shirasi, although our samples likely do not represent the full SNP diversity within each subspecies. We therefore caution over-interpretation to the level of entire subspecies populations, as our study was limited by varied sampling areas, and our ability to collect samples did not extend to the limits of each subspecies' putative boundaries. Therefore, future research targeting these areas would be beneficial for delineating potential contact zones (see below discussion on hybrid zones). However, within trailing edge subpopulations, our SNP data did allow us to make some interesting comparisons. It is intriguing that the subspecies sampled within the smallest geographic footprint (A. a. americana) had the lowest inbreeding value (F IS = 0.01), suggesting that DNA was collected from unrelated individuals. Northern regions of NH are characterized by moderate moose densities associated with favorable habitats (NH Fish and Game Commission 2015) and it is, therefore, possible that sampling  Fig. 3 Results of the Bayesian clustering analysis using the program STRU CTU RE for 1,920 SNPs from moose (Alces alces subspp) samples collected across five states (ID, WY, MT, MN, and NH). A STRU CTU RE HARVESTER graph displaying mean LnP(K) values (right y-axis), and the Delta K values (left y-axis). The x-axis shows the associated K cluster value. K = 2 and K = 3 indicated as likely number of clusters in sample population. B STRU CTU RE plots showing the proportion of individual moose membership (y-axis) in each cluster represented by a different color. Support was found for both the K = 2 and K = 3 clustering results with the sample subspecies designations aligning well, suggesting some level of genetic differentiation observed at this resolution. A. a. shirasi samples with > 0.2 ancestry proportion shared with A. a. andersoni from area in northern Montana labeled as NMT was less biased (i.e., less likely sampling of kin) than might be expected given the small geographic footprint. Consequently, our results suggest that relatively localized subpopulations, including the one we surveyed in NH, may not necessarily be at risk of negative effects like inbreeding depression often seen in small, isolated subpopulations. It should be noted that SNP assessments have been shown to miss detecting inbreeding (F IS ) that otherwise was apparent via microsatellite analysis (Zimmerman et al. 2020). As such, we feel our findings warrant more sampling from this region to test these results. Additionally, this result may not apply to all moose subpopulations in the contiguous United States. For example, we see cases of geographically isolated moose having higher inbreeding values as expected, such as the A. a. andersoni population on Michigan's Isle Royale (not included in this study) whose F IS values have doubled between 1960-1965 (F IS = 0.08) and 2000-2005 (F IS = 0.16) due to lack of gene flow from neighboring populations (Sattler et al. 2017).
A critical aspect of conservation biology is to establish patterns of genetic diversity across geographic ranges of species (Allendorf et al. 2013). Low genetic diversity can affect the health of a species or population and may carry a risk of extinction, but such outcomes may be overstated as a certainty in conservation genetics (Holderegger et al. 2006;Yıldırım et al. 2018;Teixeira and Huber 2021). In the case of moose in the contiguous U.S., low genetic diversity estimates are likely indicative to the demographic history of North American moose or sampling design rather than inbreeding. SNP-based estimates depicted lower heterozygosity and allelic richness when compared to values calculated using microsatellites in other portions of the North American range, including Alaska (Schmidt et al. 2009;Wilson et al. 2015;DeCesare et al. 2020), western Canada (DeCesare et al. 2020, and eastern Canada (Wilson et al. 2003). However, many of these studies focused on populations more latitudinally centered in each subspecies' range. Investigations of genetic patterns in species populations, including in North American ungulates, have generally found genetic diversity to be lower at the margins and greater towards the center of a species' range (Eckert et al. 2008;Thompson et al. 2019). This is often due to isolation by distance or inbreeding as a result of multiple biotic and abiotic factors (i.e. fragmented habitats, low gene flow) working against them such as fragmented habitats, low gene flow, etc. (Kawecki 2008;Hundertmark 2009). Estimates of genetic diversity in moose appear to follow a similar trend; microsatellite-based estimates from moose sampled closer to their range-edges were more comparable to our SNP data estimates (DeCesare et al. 2020). Alternatively, microsatellite-based methods may have Ancestry ProporƟons NMT inflated estimates of genetic diversity throughout the range of North American moose. The higher mutation rate and multi-allelic nature of microsatellites could lead to overestimates of genetic diversity (Landegren et al. 1998;Zimmerman et al. 2020). Recent studies support the use of SNPs as compared to microsatellite data for estimates of populationlevel diversity due to their increased ability to differentiate between clusters (Zimmerman et al. 2020). Further assessment of moose throughout their subspecies distribution with SNPs will be important to characterize genetic variation for each subspecies across their range and to test whether microsatellite-based methods are overestimating genetic diversity.

Evidence of potential gene flow between A. a. andersoni and A. a. shirasi
Our analysis indicated that A. a. shirasi samples from NMT share a notable proportion of SNPs with A. a. andersoni samples (Figs. 2, 3B, 4), potentially indicating recent or ongoing gene flow, divergence within A. a. shirasi, or clinal variation (sensu Chafin et al. 2021) between the two subspecies. DeCesare et al. (2020) recently characterized the last century of moose populations in Montana using microsatellite and mtDNA markers, and found evidence that hunting pressure in the middle of the state locally extirpated moose, dividing the moose into northern and southern subpopulations that persisted through the twentieth century, even as moose populations recovered. This human-mediated vicariance event, paired with recent contact with populations of A. a. andersoni across central/western Canada, may explain the genetic distinctiveness of NMT moose. Our findings align with that study in the evidence for potential contact zones between A. a. shirasi in NMT and A. a. andersoni in south central Canada (DeCesare et al. 2020). SNP analysis of additional samples from central Canada and the Canadian Rockies may provide the geographic coverage needed to resolve the demographic history of these two populations, including patterns of connectivity, interbreeding, and directionality of gene flow.

Future directions and implications on conservation
Due in part to their relatively large populations in North America, moose have not been granted U.S. federal protection, even as a number of local populations have rapidly decreased. Conservation efforts currently rely on state or interstate management decisions, or on individual management zones (e.g., parks, conservation areas, or protected 1 3 lands). The results of our study provide baseline subspecies genetic SNP diversity data for monitoring moose genetic health at a local level. This may aid in long-term monitoring of existing genetic diversity, assessing the effects of environmental change on local moose populations, and possibly inform human-mediated actions to improve genetic diversity for future subpopulations. Essentially, states or regional management entities can use this expanded genomic data to coordinate management at the molecular level. Our findings suggest edge subpopulations of A. a. andersoni and A. a. shirasi in northern Montana likely have had opportunities for gene flow. The implications of overlapping subspecies boundaries for local management could benefit from multi-state management considerations for certain populations. This includes consideration of movement corridors between populations and decisions related to hunting limits and land use. Such ecological passageways, where admixture may occur, could be valuable for conservation, and potentially support the adaptive capacity of moose in the face of climate change (Rosvold et al. 2013;Hendricks et al. 2019).
Contemporary moose populations are facing warming average temperatures which affects their forage quality and increases their exposure to diseases and parasitic infestations. This is evidenced by increased meningeal worm (Parelaphostrongylus tenuis) infestation through the expansion of white-tailed deer into more northerly habitats, and seasonal changes that support higher winter tick (Dermacentor albipictus) survival (Murray et al. 2006;Lankester 2010;Carstensen et al. 2019;Ellingwood et al. 2020). Genetic variation plays a significant role in facilitating resilience to a changing environment, and evolutionary adaptation is an important process for improving an individual's fitness under such selective forces (Hendry et al. 2008).
Many of the population genetic assessments on moose, including our study, have focused on using neutral genetic variation to resolve demography of populations over time (e.g., phylogenetic relationships, mutation, gene flow). However, considering the anthropogenic and natural threats moose currently face, future studies concerned with moose genetic diversity might benefit from the investigation of functional genetic diversity, or the measure of diversity of potentially adaptive markers, by using RNA-seq and wholegenome resequencing (WGR) methods (Brodie et al. 2021). This could be accomplished using the recently sequenced European moose genome (Dussex et al. 2020), or optimally, by first sequencing the nuclear genome of the North American moose, and then looking to see which SNPs/genes appear most impacted. Understanding both the genomic and transcriptomic characteristics of successful moose subpopulations, particularly in the face of increasing stressors, may highlight which genes or genomic regions are influencing their adaptive capacity. Moreover, identifying gene flow between subspecies and the potential resultant admixture of moose subspecies may have similar implications for management in the context of a changing environment, particularly if hybrid moose are more able to adapt. The integration of high-resolution genomic data from SNP analyses into these studies can have a high value to conservation and management planning for moose (Funk et al. 2019), as reintroduction or population augmentation can be restricted to genotypes that maximize survivability in specific environments. Based on the results of such research, conservation efforts may benefit from a shift to include special protection for overlapping habitat in the future.

Conclusion
The application of SNP analysis methods to targeted study areas across moose species or subspecies ranges provides the type of robust data needed to inform herd management and conservation priorities. With increasing pressure on moose populations in the United States, the ability to understand genetic adaptation mechanisms is becoming even more important. Specifically, the use of SNP data increases the power of future research on moose demographics including the leptokurtic (rare, long-distance) dispersal hypothesis (Hundertmark et al. 2003;MacLeod et al. 2013) and identification of gene regions under selection. We envision further SNP analyses of this type including A. a. americana samples from a broader geographic range, moose samples from each subspecies range within southern Canada, and A. a. gigas samples from Alaska, thus allowing for a more fully resolved demographic characterization of the current North American moose population with specific benefit to local subspecies populations.