Metabarcoding inventory of an arctic tundra soil ecosystem reveals highly heterogeneous communities at a small scale

Biodiversity surveys of Arctic soil ecosystems are limited. Here, we provide a sequence-based inventory of soil fauna from an Arctic tundra ecosystem near the Canadian High Arctic Research Station in Cambridge Bay, Nunavut. Invertebrate communities were extracted at a vegetated and non-vegetated site in three replicates and inventoried using 18S metabarcode sequencing. A total of 361 amplicon sequence variants (ASV) were identified and assigned to the closest matching taxonomic orders, most of which belonged to the Nematoda and Arthropoda. Vegetated soils showed no significantly higher ASV richness relative to non-vegetated soils although they contained a significantly higher diversity of arthropod taxa including insects, mites, and springtails. Most taxa were found only at a single location and even samples from the same site displayed distinct communities, suggesting that belowground species richness in Arctic tundra habitats is highly heterogeneous. Preserving soil biodiversity in a changing Arctic is essential for Inuit communities who rely on intact tundra ecosystems for their health and wellbeing.


Introduction
The importance of belowground biodiversity for terrestrial ecosystem functioning is well-established (Bardgett and van der Putten 2014; Lefcheck et al. 2015), and much recent focus has been put to investigate drivers of biodiversity loss (Duncan et al. 2015). Our understanding of soil community composition and the individual species' components has increased rapidly following the application of genomic techniques , stable isotope work (Ferlian et al. 2015), and network analyses (Derocles et al. 2018;Ramirez et al. 2018). Yet, specific taxonomic knowledge is still very limited in many belowground systems (Tedersoo et al. 2014) and the biotic diversity of soil ecosystems is often overlooked in conservation policy (Guerra et al. 2021).
The association between the loss of soil biodiversity and a resulting loss of soil function is well demonstrated (Wagg et al. 2014). Key functions provided by healthy soil ecosystems include nutrient cycling and carbon sequestration (Koltz et al. 2018) in both natural and agricultural systems (Bender et al. 2016). Unfortunately, such functions are diminished as soil ecosystems become degraded, polluted, or over-exploited-leading to a loss of soil biota (Amundson et al. 2015). Human-caused changes in land use and climate directly influence soil food webs and can impact their resistance and resilience to climate change (de Vries et al. 2012;Griffiths and Philippot 2013). Biodiversity loss further directly threatens the ability of these systems to respond to environmental changes, particularly climate extremes .
Arctic ecosystems are particularly vulnerable to global and local human activity as they are experiencing some of the most rapid climate warming (Post et al. 2009). Climate change has resulted in risks to Indigenous country food systems through more unpredictable environmental conditions (Statham et al. 2015), a trend that is predicted to continue (IPCC 2022). Many Arctic regions are also seeing a rapid increase in the exploitation of their terrestrial environments for natural resources (Kumpula et al. 2011) and agriculture (Stephen 2018). The potential effects of these further environmental changes on the biota of previously uncultivated soils could be dramatic. The introduction of non-native species as a result of global warming is predicted to further endanger endemic species' diversity and lead to shifts in species' ranges (Nielsen and Wall 2013).
Concurrently, Arctic soil ecosystems have a legacy of habitation by humans, including Inuit (Keith et al. 2007), and provide a critical link in supporting traditional country foods such as caribou and muskox. Socio-economic development of high Arctic regions is predicted to intensify the use of natural resources and lead to a degradation of these environments (Ehrich et al. 2019). Links between changes in micro-invertebrate species' composition and risks to Indigenous food security have already been demonstrated (Kafle et al. 2020). Potential development in the north thus needs to be guided by an understanding of the diversity of soil invertebrates (free-living and parasitic), especially in high Arctic regions (Bach et al. 2020). Inventories of invertebrate diversity also compliment traditional Indigenous knowledge in Arctic communities, which focus on stewardship of the natural environment (Gadgil et al. 1993).
There is an urgent need to assess the species' diversity of soil invertebrates in Arctic regions and to identify potential or ongoing risks related to human activities (Pentinsaari et al. 2020). Recent large-scale analysis has shown that nematode densities in sub-Arctic regions are the highest worldwide (van den Hoogen et al. 2019). However, much of this diversity has not been classified taxonomically and knowledge of the ecological roles of many of these groups is very limited. Currently, invertebrate diversity inventories in the Canadian Arctic are only available at some long-term research sites (Rich et al. 2013) and for some highly localized sites (Culjak Mathieu 2020). Unfortunately, this may lead to a biased understanding of patterns of biotic diversity across the whole Arctic (Metcalfe et al. 2018), and may not allow local Indigenous communities to identify and respond to threats to their soil ecosystems. Over the last years, a framework for generating and curating these data is emerging (Ramirez et al. 2015). This has led to a more comprehensive understanding of climate and vegetation effects (Bastida et al. 2020) as well as distribution patterns of soil invertebrates in the Arctic (Phillips et al. 2019;Gillespie et al. 2020).
Sequencing environmental DNA directly from soils can provide an assessment of belowground diversity for Arctic soil invertebrates (Bik et al. 2012) and has been suggested as a tool specific for the monitoring of polar ecosystems and the detection of invasive species (Czechowski et al. 2017). Further, such genomic inventories can provide critical insights into the complexity of these different ecosystems, such as the presence of predatory or parasitic species' guilds (Creer et al. 2016). The composition of invertebrates in Arctic soils is known to vary depending on soil type (Hansen et al. 2016) and varying habitats respond differently to environmental change (Coulson et al. 1993). Baseline inventories and subsequent monitoring of soil invertebrates is crucial to understanding the broader consequences of environmental change for soil ecosystems, especially in the Arctic (Høye and Culler 2018).
Here, we extracted and sequenced communities of eukaryotes from soil that was collected from vegetated and nonvegetated sites near the Canadian High Arctic Research Station in Iqaluktuuttiaq (Cambridge Bay), Nunavut. This region has been populated by hunters of fish, seal, muskox, and caribou for an estimated 4000 years (Kitikmeot Heritage Society 2012). Soil inventories therefore provide a taxonomic overview of community structure and diversity for an Arctic tundra system that has been managed under Indigenous stewardship for thousands of years. It contributes to building a comprehensive framework of the biodiversity in this community, relevant to larger vertebrate species and aboveground diversity that provides subsistence for the Inuit people of this region.

Methods
Two sites with differing vegetation cover were identified near the Canadian High Arctic Research Station (CHARS) in Cambridge Bay, Nunavut (vegetated site: 69° 07′ 52.4′′ N 105° 03′ 36.8′′ W, non-vegetated site: 69° 07′ 52.3′′ N 105° 03′ 24.3′′ W, see Fig. 1). Three soil samples were collected from each site in July 2019. Samples at each site were collected within a 50 m radius, selected based on aboveground plant coverage, with Samples V1-V3 obtained from soils with vascular plant coverage and Samples NV4-NV6 obtained from soils from frost-eroded areas with little plant coverage (Fig. 1c, d). The three vegetated samples represented the main plant associations observed on the local tundra system, namely: true sedges (Carex spp.) for Sample V1; a mixture of other sedge grasses and peat moss at Sample V2; and mixed shrub/sedge grass cover for Sample V3. For each sample approximately 250 g of the topsoil was taken from directly underneath any present vegetation (at the soil surface), and collected to a depth of 10 cm. Each sample was placed into a sterile plastic bag, homogenized and transported in insulated containers to CHARS prior to freezing and subsequent transportation frozen (< − 10 °C) to Brigham Young University (BYU) for further analyses.
Once at BYU, a modified sugar-flotation protocol was used to extract micro-invertebrates larger than 40 µm from 100 g of soil from each sample (Freckman and Virginia 1993). DNA was extracted from each sample using a Qiagen DNEasy Blood and Tissue kit (QIAGEN, Germantown, MD), modified by crushing any organisms collected by centrifugation with a sterile pestle before proceeding as per the manufacturer's instructions. Sequence libraries were constructed using 10 µL of the resulting 100 µL product, which were amplified using standard Illumina sequencing primers and amplicon primer sequences for the 18S small ribosomal subunit marker (18S) as described in Caporaso et al. (2012). The 18S region amplified is the one recommended by the Earth Microbiome Project (Thompson et al. 2017) for identification of micro-eukaryotes commonly found in soils (forward primer 1391f: GTA CAC ACC GCC CGTC; reverse primer EukBr: TGA TCC TTC TGC AGG TTC ACC TAC ), and reference databases for this ribosomal sequence are increasingly found in the literature (Obiol et al. 2020). Sequencing primers and Illumina barcodes were removed before subsequent analysis using cutadapt (v3.2) (Martin 2011). Resulting metabarcoding reads were filtered, trimmed, and clustered to amplicon sequence variants (ASVs) using the DADA2 pipeline (v1.18) (Callahan et al. 2016) in R (v. 4.0.4). Reads were quality filtered and truncated at 140 base pairs for the forward and reverse compliment strands to each pair. DADA2 was used for subsequent alpha and beta diversity analyses and taxonomic assignment to both the SILVA (v 1.32, Quast et al. 2013) and PR2 (v 4, Guillou et al. 2013) databases. Specifically, both the absolute richness (Observed Features) as well as two abundance-controlled metrics (Shannon and Simpson diversity) are used to provide a ranking of the diversity of the individual samples (Lande et al. 2000). Community dissimilarities were investigated using phylogenetically naïve (Bray-Curtis distance) and phylogenetically informed distance metrics, using both abundance weighted and unweighted options (weighted and unweighted UniFrac distance, Lozupone and Knight 2005). Distance metrics were displayed in non-metric multidimensional scaling plots to observe the similarities in community composition between samples (Clarke 1993). Maximum likelihood phylogenetic trees were constructed with the phyloseq package (v. 1.34) (McMurdie and Holmes 2013) in R.
ASVs with poorly resolved taxonomic resolution in the Silva and PR reference databases were further blasted to the NCBI Nucleotide database (Schoch et al. 2020) (May 20th 2021) to be assigned to the taxonomic level for which consensus existed in this database (no other orders showed up as BLAST hits with a score higher than 200). No finer resolution than order-level taxonomy was derived from NCBI BLAST classification; hits to environmental sequences were excluded.
Following preliminary analyses, we found that Nematoda and Arthropoda were the two most commonly sequenced taxa in each sample. Accordingly, the relative frequency of sequences and the diversity of Nematoda and Arthropoda were separately considered across the samples to investigate the similarities in invertebrate community composition between soil types.

Quality control and data retention
Total reads per sample were 39,286, 41,915, 43,318, 66,907, 59,489, and 41,217 for Samples V1-V3 and NV4-NV6, respectively, before trimming, for a total of 292,132 reads. Supplementary Fig. 1 shows data loss during filtering and trimming using the DADA2 pipeline (Callahan et al. 2016). Out of 420 resulting amplicon sequence variants (ASVs) sequences 59 were identified as chimeras and removed, making up 3.83% of total reads. After chimera removal the total of reads across all samples was 176,650, for a data loss of just under 40% (see supplementary Fig. 1 for data loss in each step).

Taxonomic diversity
Based on an initial taxonomic assignment 113 out of 361 total ASVs were identified to the taxonomic Order level in the Silva database, increasing to 147 out of 361 ASVs by combining the Silva and PR databases. Including the NCBI Blast search resolved another 106 ASVs to the Order level, totaling 253 out of 361 ASVs resolved, including all but three of the 50 most common ASVs across all samples. Taxonomic insecurities for the remaining 108 ASVs and the inability to assigned detected taxa to a higher resolution than order level was made difficult by the lack of genomic references for Arctic soil biota. The 18S primers developed for the Earth Microbiome Project are designed to capture a very broad range of eukaryotic diversity but as a result lack the power to resolve species' taxonomic diversity at a very fine scale (Pawlowski et al. 2012).
The most abundant ASV at the non-vegetated site belonged to the phylum Nematoda, while at the vegetated site Arthropoda made up most of the reads (See supplementary Fig. 2). Platyhelminthes were only found at the vegetated site, and annelid DNA was recovered from Sample V1 only. Across all samples, the other most common phyla after Nematoda and Arthropoda were Rotifera, Platyhelminthes, Ascomycota, Annelida, and Tardigrada. Figure 2 presents the results of taxonomic identification across all samples for these seven most abundant phyla, which make up between 90 and 95% of the total reads for each sample (See Supplementary Fig. 2 for abundance relative to total reads). Fig. 2 Relative proportion of the most common phyla. Relative abundance is calculated as the proportion of reads of each of the seven included Phyla to the total number of reads of only those seven Phyla: Annelida, Arthropoda, Ascomycotes, Nematoda, Platyhelmintha, Rotifera, and Tardigrada. The non-vegetated sites (NV4, NV5, and NV6) showed a higher relative abundance of nematodes than arthro-pods as compared to the vegetated sites, while Platyhelminthes and annelids were only found in vegetated soils (at all three sites and in the Carex vegetation cover site only, respectively). Sample vegetation: V1) True sedge grasses (Carex spp.), V2) Sedge grass and peat moss, V3) Mixed sedge/shrub cover, V4:V6) Non-vegetated soil Nematodes Figure 3 shows the distribution of nematode orders across the two sites. Nematode diversity was not significantly higher in vegetated than non-vegetated soils (one-tailed t-test, p < 0.05). The most common ASV in the entire dataset belonged to the Enoplida and was highly abundant in all three non-vegetated soils while very rare in the soils with vegetation cover. While most enoplids are marine in origin, some members of this genus occur as free-living bacterivorous nematodes in soils (Smythe 2015). The third most common ASV belonged to the Araeolaimida and was highly abundant across all six samples. Araeolaimids are also commonly seen as marine free-living nematodes, with some in the order observed living in soils as free-living bacterivores (Yeates 1988). While these two free-living nematode examples occurred in most soil samples, other nematode ASVs were highly restricted to either site or to a single sample, such as the most common nematode ASV (belonging to Enoplida), the sixth (belonging to Mononchida), and eighth (belonging to Tylenchida) most common ASVs all being highly abundant in the non-vegetated site but nearly absent from any of the vegetated soils (see supplementary Table 1). Figure 4 shows the distribution of arthropod orders across the two sites. A higher diversity of arthropod ASVs was found in the vegetated soils than in non-vegetated soils (onetailed t-test, p < 0.05). Samples NV4-NV6 showed a high similarity in arthropod diversity, with a similar composition of collembolan and sarcoptiform arthropods. However, most collembolan ASVs were unique to one or two samples, except for the most common collembolan ASV which was present in Samples V1, NV4, NV5, and NV6. Trombidiform mites and any insects such as Diptera were only present in the vegetated soils, which were also the only samples to share the presence of the most common platyhelminth, as well as some other platyhelminths occurring only in a single sample. Generally, the vegetated site showed a much higher abundance of arthropod DNA, except for a single collembolan ASV in Sample NV5 that made up around 20% of total sample abundance.

Inter-site comparison
Seven ASVs were present in all samples, six of which were among the 30 most common ASVs. Of the 361 total ASVs, 247 were unique to a single sample and a further 31 and 18 ASVs were specific to either the vegetated or non-vegetated sampling sites, respectively. The remaining 65 ASVs were not restricted to a single sample or site. However, many were only present in very low abundances in the vegetated soils (as was the case for some of the most common nematode taxa in the non-vegetated soils). The total number of reads that were limited to a single sample or site was 32.17%, indicating that most taxa were relatively rare across samples. Strictly sample-specific taxa made up approximately 16% of total read copies, but more than two-thirds of the number of ASVs identified in this study. Figure 5 shows the alpha diversity distribution across the six samples, including the observed features measure (A) and Shannon and Simpson diversity metrics which account for the proportional abundance of species (Shannon 1948). The sample with a Carex dominated ground cover (Sample V1) consistently showed high diversity compared to the other communities, while Sample NV6 (non-vegetated) had the fewest observed features and a low biodiversity index.
Patterns of community clustering are shown in Fig. 6 for several measures of beta diversity. Two of the vegetated soils (Samples V2 and V3) clustered together in all NMDS plots, and a general distinction between the non-vegetated (Samples NV4-NV6) and the vegetated (Samples V1-V3) site was observed across all plots. The three vegetated soil samples also show higher within-group differences. The Carex-covered soil community (Sample V1) specifically was less similar to the other two samples from the same site, potentially caused by the effects of vegetation cover on community composition. Both inter-site and intra-site comparisons are broadly consistent between the different distance metrics used and these patterns were also consistent when considering nematode and arthropod diversity separately ( Supplementary Figs. 3 and 4, respectively). Figure 7 shows the phylogenetic relationship between the identified nematode and arthropod groups and their presence across the two sites. While monophyletic groups of arthropods can be identified as present only in vegetated soils (all Diptera, Hemiptera, and a subset of Collembola) most nematode clades are shown to be present in both soil types.

Discussion
The composition of the samples we analyzed showed a high degree of heterogeneity. 247 out of 361 total ASVs identified are unique to a single sample. The ecological niches of endemic taxonomic orders can be used as indicators for food web structure (Ferris et al. 2001), although it is important to caution against over-interpreting relative abundance data obtained from metabarcoding (see Schenk et al. 2019). Nematodes such as Rhabditida and Tylenchida are shown here to be prevalent in non-vegetated soil samples and are likely to be free-living bacterivores (Qing and Bert 2019), while Dorylaimida are seen more frequently in the three samples from the vegetated site where they comprise free-living or plant and animal parasitic groups (Lee 2002). Most Mononchida are predators and observed in non-vegetated soils (Ahmad and Jairajpuri 2010), occupying a higher level in the food chain in these ecosystems. Few Trombidiformes were found at the non-vegetated site while making up roughly one  third and one fourth of ASV reads in the vegetated Samples V2 and V3, respectively, suggesting that only vegetated soils host enough prey items for these predatory mites (Seniczak et al. 2020). Sarcoptiform mites were instead found at both the vegetated and non-vegetated site and are mainly expected to have a mycophagous and saprophagous ecology in the Arctic (Young et al. 2012).
The composition of arthropods in these soils show distinct communities exist between the vegetated and nonvegetated sites, with only three out of the 25 total types of Collembola shared between the two soil types, and three out of 12 total mite taxa. Many of the arthropod ASVs were only found in a single sample, indicating that some ASVs were potentially associated with specific plant communities, similar to patterns observed for bacterial and fungal diversity elsewhere in the Arctic (Bölter 2003). Both phylogenetic trees indicate a high degree of heterogeneity of nematode and arthropod taxa in these soils, particularly among the rarer observed ASVs in this study. Arthropod species with such patchy distributions are suggested as key indicator species of changing Arctic ecosystems as they are likely to be highly vulnerable to climate changes and invasive species (Hodkinson et al. 2013;Gillespie et al. 2020).
These ecosystems are warming rapidly and resulting declines in biodiversity and biomass have been demonstrated aboveground (CAFF 2013)-with further declines predicted (Niittynen et al. 2018). However, the belowground components of these ecosystems are generally Fig. 7 Phylogenies of all collected nematode and arthropod ASVs and their presence in the different soil types. 139 nematode ASVs and 51 arthropod ASVs are displayed in an unrooted phylogeny to indicate the relationships between the diversity found in the two soil types. Images are of a collembolan and nematode representative of the Kitikmeot region of Nunavut, licensed by the University of Waikato, New Zealand and the Centre for Biodiversity Genomics, Guelph, CA-respectively. Green dots indicate the ASV was found in a vegetated soil sample while orange dots indicate its presence in one of the non-vegetated soils. Size of the dots indicate the number of reads found in that sample. Colored boxes indicate taxonomic order. Gray lines are unassigned, blue lines deviate from the group otherwise indicated by color less well known. The diversity of belowground systems is a driving force behind the above-ground diversity of plants (Frouz 2018) and is therefore of great importance in supporting country food such as muskox and caribou. Arctic microinvertebrates have also been shown to directly affect this food supply through the transmission of diseases such as brucellosis in wild muskox (Tomaselli et al. 2019), and arthropods such as biting flies can cause behavioral changes in large mammals (Witter et al. 2012). Soil microbiota thus play key roles in Arctic foodwebs supporting the health and wellbeing of Indigenous communities (Pedersen et al. 2020). Understanding the dispersal mechanisms (Coulson et al. 2003) and responses of below-ground fauna to climate and vegetation changes (Joly et al. 2009) can provide Indigenous communities with crucial knowledge for responding to changing environmental conditions.
Our study provides an inventory of belowground invertebrate diversity from the Canadian central Arctic. Many of the ASVs we identified in this study were found only in a single sample and point to a high degree of heterogeneity on the small spatial scale at which we investigated these belowground invertebrate communities. While some ecological information can be inferred from metabarcoding (Makiola et al. 2020), the incomplete nature of publicly available genetic repositories and biases in geographic and taxonomic representation can reduce confidence in individual species identification and their function in local habitats. Similar and ongoing studies for above-ground invertebrate communities (Pentinsaari et al. 2020) have also highlighted similar issues with existing morphological and molecular taxonomic databases. Our study and ongoing genomic and taxonomic characterization of the eukaryotic soil diversity in the Arctic are first steps in redressing this knowledge gap.
Acknowledgements Financial and logistic support for this research was provided by Polar Knowledge Canada. Collembolan pictures were provided by Barry O'Brien.
Author contributions IH, DW and BA conceived research objectives. JJ and BA designed research approach. BV and IH collected samples. JJ processed samples and analyzed data. JJ wrote the first draft of the manuscript. All authors contributed to the interpretation of data and subsequent drafts.

Code availability
Code used for data analysis is available as a supplementary file (Sup File 1).

Conflict of interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.