Primer biases in the molecular assessment of diet in multiple insectivorous mammals

Our understanding of trophic interactions of small insectivorous mammals has been drastically improved with the advent of DNA metabarcoding. The technique has continued to be optimised over the years, with primer choice repeatedly being a vital factor for dietary inferences. However, the majority of dietary studies examining the effect of primer choice often rely on in silico analyses or comparing between species that occupy an identical niche type. Here, we apply DNA metabarcoding to empirically compare the prey detection capabilities of two widely used primer sets when assessing the diets of a flying (lesser horseshoe bat; Rhinolophus hipposideros) and two ground-dwelling insectivores (greater white-toothed shrew; Crocidura russula and pygmy shrew; Sorex minutus). Although R. hipposideros primarily rely on two prey orders (Lepidoptera and Diptera), the unique taxa detected by each primer shows that a combination of primers may be the best approach to fully describe bat trophic ecology. However, random forest classifier analysis suggests that one highly degenerate primer set detected the majority of both shrews’ diet despite higher levels of host amplification. The wide range of prey consumed by ground-dwelling insectivores can therefore be accurately documented from using a single broad-range primer set, which can decrease cost and labour. The results presented here show that dietary inferences will differ depending on the primer or primer combination used for insectivores occupying different niches (i.e., hunting in the air or ground) and demonstrate the importance of performing empirical pilot studies for novel study systems.


Introduction
In a constantly changing environment, knowledge of complex food webs is vital for our understanding of ecosystem functioning and biodiversity conservation. The advent of Next-Generation Sequencing (NGS) technology has revolutionised the analyses of trophic interactions (Deagle et al. 2019;Browett et al. 2020), with DNA metabarcoding (i.e., simultaneous identification of multiple species using a standardised region of DNA) of faecal samples or gut contents becoming widely adopted for describing diets (Pompanon et al. 2012). Despite the significant developments and improvements afforded by DNA metabarcoding for dietary studies over the last decade, the technique has certain limitations. These include problems in describing diverse diets (e.g., omnivorous species); assigning sequences to appropriate taxonomic levels with incomplete or poor reference databases; false negatives/positives for species detections, 1 3 and host co-amplification (Piñol et al. 2015;Alberdi et al. 2019;Deagle et al. 2019).
Several of these limitations are particularly evident when studying the diets of mammalian insectivores in terrestrial environments. Invertebrates are massively diverse and widely distributed (Stork 2018), which makes describing invertebrate-based diets via DNA metabarcoding challenging. Given that insectivores can potentially have a broad diet (Brown et al. 2014), a key consideration is the choice of primers to use due to varying detection capabilities (Corse et al. 2019), or target only specific invertebrate groups (Saitoh et al. 2016). To capture the expected wide range of invertebrate taxonomic groups, highly degenerative (nonspecific) primers can be used, but studies comparing their efficiency have largely been restricted to analyses performed in silico (Piñol et al. 2018) or using bulk samples and/or mock communities (Elbrecht et al. 2019). While these are essential steps in primer design and have led to the ability to detect a wide range of invertebrate species, they may not account for some of the potential biases within a dietary context (i.e., predator/host amplification; Zeale et al. 2011). The broader the taxonomic range of the primers, the more likely the chance of amplifying non-target taxa and reducing the amount of information on a species diet.
In terms of insectivorous mammalian predators, bats are well represented in dietary DNA metabarcoding studies due to their ecological importance and their significant role in the suppression of insects, e.g., pests and vectors implicated in the spread of disease that may negatively impact agriculture (Taylor et al. 2018). They have not only served as a key study group for primer comparisons, but also for methodological development such as sampling design, evaluation of setting clustering thresholds for Molecular Operational Taxonomic Unit (MOTU), and mitigating contamination/errors (Alberdi et al. , 2019. Applying these measures can result in the detection of hundreds of species in a bat's diet without losing information to host co-amplification (although it is worth noting that host co-amplification can benefit a bat dietary study by simultaneously detecting a wide range of prey taxa and confirming the predator species from faecal samples; Galan et al. 2018;Tournayre et al. 2020). Although investigations into the diets of ground-dwelling and semi-aquatic mammalian insectivores using DNA metabarcoding are less frequent, recent studies have included comparisons of primer combinations and host/diet detection (Brown et al. 2014;Esnaola et al. 2018) and those focusing on resource overlap between different insectivores (Brown et al. 2014;Biffi et al. 2017a). Studies searching for the 'best' primer combinations tend to have been performed on a single insectivore niche (e.g., flying or semiaquatic). While it has been acknowledged that the best primer combination for detecting invertebrate prey in one system may not be the best for another (Tournayre et al. 2020), there has been a lack of studies investigating this directly. It is, therefore, important to directly compare the effect of various primers on multiple insectivores occupying different ecological systems (Corse et al. 2019).
Here, we apply DNA metabarcoding to examine the diet of three mammalian insectivores with two widely used primer pairs Gillet et al. 2015) targeting the mitochondrial Cytochrome C Oxidase Subunit 1 (COI) region (chosen due to its high taxonomic coverage, resolution and well-defined reference database; Clarke et al. 2017;Elbrecht et al. 2019). These primer pairs differ in terms of prey identified (dietary constituents) and predator (host) amplification (Esnaola et al. 2018;Aldasoro et al. 2019). The three focal insectivores were chosen based on ecological niche and their proposed broad diet. The lesser horseshoe bat (Rhinolophus hipposideros) was used to represent a flying predator, while the pygmy shrew (Sorex minutus) and greater white-toothed shrew (Crocidura russula) were used to represent ground-dwelling predators. The diet of lesser horseshoe bats is known to be highly diverse, with 11 orders identified overall but largely dominated by Diptera and Lepidoptera as shown by both hardpart and DNA metabarcoding analyses Baroja et al. 2019;McAney and Fairley 1989). Their diet also changes by season and locality (McAney and Fairley 1989;Baroja et al. 2019). Pygmy shrews have a diet consisting of 12 identified orders from multiple hard-part dietary analyses, with Araneae, Coleoptera, and Opiliones highly represented across different parts of the species' range (Meharg et al. 1990;Churchfield and Rychlik 2006). A recent shotgun metagenomics study (not to be confused with the metabarcoding approach used here) on five individuals also identified the importance of Lepidoptera and Acari (Ware et al. 2020). Detailed studies of the greater white-toothed shrew's diet are limited, but Lepidoptera larvae, Araneae, and Isopoda are important components of the species' diet in Europe (Bever 1983). The species is known to catch vertebrates (including reptiles, amphibians, and young small mammals; Churchfield 2008), but concrete evidence of predation is lacking. Lizards/geckos have occasionally been recovered from stomachs of the species in its African range, but it is unclear if this is due to predation or scavenging (Brahmi et al. 2012).
Focusing on these three different species, our main objective in this study was to establish whether different primer sets (or a combination of these primer sets) are appropriate for detecting different trophic niches in multiple insectivorous mammals.

Sample collection and DNA extraction
The bat and shrew samples used here are a subset of samples obtained from larger, separate studies that required some differences in methodologies for sample collection and DNA extraction. These differences have been acknowledged during interpretation of the results.
Bat faecal samples were collected non-invasively by Harrington (2018) at known bat roosts along their distribution range in the west of Ireland (Fig. S1). Sampling of bat roosts was carried out under licence from the NPWS (licence number DER/BAT 2016-29). Large sheets of plastic were laid on the ground within each roost (hidden from sunlight) and left for a period of 1-2 weeks. Droppings were collected and stored frozen at − 20 °C or DNA extracted within 24 h using the Zymo Research Genomic DNA™-Tissue MicroPrep kit following the protocol used for faecal DNA extraction in Harrington et al. (2019). Across the six locations sampled for R. hipposideros (Fig. S1), a total of 1341 faecal samples were collected. Each DNA extract was identified to species level using a species-specific real-time PCR assay (Harrington et al. 2019) and a total of 231 R. hipposideros faecal samples were identified to individual level using a panel of seven microsatellite markers originally designed by Puechmaille et al. (2005), and redesigned and optimised to work efficiently with faecal DNA by Harrington (2018) via two multiplex PCRs. Each sample was amplified, analysed, and scored via three independent PCRs. A total of 24 individuals identified as R. hipposideros in Harrington (2018) were used in this study. The samples have high-quality DNA, were identified to individual level, sex-typed, and were split evenly across location (n = 4 in each of the six points) and sex.
Pygmy shrews (S. minutus) and greater white-toothed shrews (C. russula) were trapped from hedgerows along secondary and tertiary roads adjacent to agricultural land in Ireland and Belle Île (France; Fig. S1). Shrews were immediately euthanised by cervical dislocation following guidelines set out by Sikes (2016) and under licences C21/2017, AE18982/I323 (Ireland) and A-75-1977 (Belle Île), and ethical approvals ST1617-55 and AREC-17-14. The shrew samples are part of a larger study that required this sampling method. Carcasses were stored in separate disposable bags in a cooler until dissection later that day (max. 10 h). The entire gut (gastrointestinal) tract was removed and stored in absolute ethanol at a 1:4 (sample:ethanol) ratio (Egeter et al. 2015). To avoid cross-contamination, all dissections were performed on disposable bench covers and all tools were cleaned and flamed between samples. Gut contents were stored at -20˚C upon returning from the field to the lab (max. 12 days). Gut tracts were defrosted on ice, removed from ethanol, and air-dried. Gut contents were removed from the intestines on disposable bench covers and tools were cleaned and flamed in between each sample to avoid cross-contamination. DNA was extracted from the entire gut contents using the DNeasy Power Soil Kit (Qiagen). DNA extractions were quantified using the Qubit dsDNA BR assay kit (Thermo Fisher Scientific) and subsequently diluted in molecular grade water to 10-15 ng/µl. A subset of 12 C. russula (10 from Ireland, and 2 from Belle Île) and 15 S. minutus (10 from Ireland, and 5 from Belle Île) samples were chosen for this study. This subset of shrews was chosen to represent multiple sample sites from a larger study on shrews in Ireland and Belle Île. In total, 51 insectivores were analysed, including 27 ground-dwelling and 24 flying individuals. Sampling locations of the shrews used can be found in Table S1 and Fig. S1.

Polymerase chain reaction (PCR)
DNA extracts were amplified using two primer sets targeting different short fragments of the mtDNA COI gene. The The two pairs of primers will be referred to as the Zeale and Gillet primer sets and datasets from here on. A set of 24 unique eight base pair multiplex identifiers (MID) tags were added to the Zeale and Gillet primer sets to allow for the multiplexing of samples into a single sequencing run. A different set of 24 unique MID tags were used for each primer pair.
The PCR mix for both Gillet and Zeale primer sets contained 12.5 µl Qiagen Multiplex PCR Mastermix, 1 µl of each primer (5 µm), 7.5 µl of molecular grade water, and 3 µl of DNA template (molecular grade water for negative controls). PCR conditions for the Zeale primers included an initial denaturation at 95 °C for 15 min, followed by 40 cycles of 95 °C for 20 s, 55 °C for 30 s, and 72 °C for 1 min, followed by a final extension at 72 °C for 7 min Alberdi et al. 2018). PCR conditions for Gillet primers were trialled from Esnaola et al. (2018), but amplified a non-target region of DNA approximately 200 bp and 500 bp larger than the target region in S. minutus samples. The PCR conditions were altered to a two-stage PCR with higher annealing temperatures to increase specificity and decrease amplification of non-target fragments. The altered PCR conditions for Gillet primers involved an initial denaturation at 95 °C for 15 min followed by 10 cycles of 94 °C for 30 s, 49 °C for 45 s, and 72 °C for 30 s, followed by 30 cycles of 95 °C for 30 s, 47 °C for 45 s, and 72 °C for 30 s followed by a final extension of 72 °C for 10 min. The PCRs were run in triplicate and subsequently pooled, and the success of the reactions was determined by electrophoresis on a 1.2% agarose gel, which included the four negative control PCR products.
Library preparation, sequencing, and bioinformatic steps are provided in Appendix 1 of the Supplementary Material.

Taxonomic identification and range
The number of MOTUs identified and taxonomically assigned to different levels was compared between datasets using sequence clustering thresholds 95% and 98% to determine the capabilities of both primers and the overall effect of the clustering threshold. The final clustering thresholds were chosen based on the number and proportion of MOTUs that were taxonomically assigned. The clustering threshold chosen was the value with the highest proportion of MOTUs assigned to species and genus level, with reduced proportions of MOTUs restricted to order and family. In addition to this, the clustering values commonly used in the literature were also taken into account for our choice .
The taxonomic range was compared at each taxonomic level between primer sets and considered separately for both bats and shrews to establish if one primer was suited to a particular predator diet. To assess the ability of each primer to detect unique taxa, the overlap of accurately identified taxa was measured between Zeale and Gillet primers for bats and shrews at order, family, genus, and species level.

Alpha diversity
The samples represented by the combined effort of both Zeale and Gillet here have an extra advantage of increased sequencing depth (i.e., the sequencing depth of the combined primer dataset is the sum of the sequencing depth of the Zeale and Gillet datasets). Increased sequencing depth can increase alpha diversity measures, so to account for this, samples (and groups of samples) were rarefied to an equal sequencing depth to achieve a more accurate comparison of alpha diversities. Samples were rarefied to the lowest sampling depth (1110 reads) before alpha diversity measures (species richness and Shannon diversity) were calculated. To account for any stochastic results from rarefying samples, this process was repeated 100 times and the average alpha diversity scores were taken for each metric. Significant differences in alpha diversity between groups were identified using ANOVA and a Tukey post hoc test with Tukey's method for adjusting for multiple comparisons.
To determine the niche width, the samples were merged according to mammal species and primer used by summing the reads for each MOTU. The merged samples (i.e., each group) were then rarefied to the lowest read depth of said merged samples (105,501 reads). The niche width of each mammal species amplified by different primers was measured using the standardised Levin's index, Shannon diversity index (for details on measurements see Razgour et al. 2011), and Pielou's evenness index using the R packages vegan and spaa (Zhang 2016).

Beta diversity
Data were normalised by transforming sequence counts into relative read abundances per sample and a distance matrix was created for the dataset using the Bray-Curtis dissimilarity method. Data were visualised using a non-metric multidimensional scaling (NMDS) ordination plot. To determine any compositional difference in prey taxa identified between consumer species and/or primer used, permutational multivariate analysis of variance (PERMANOVA) was performed with 10,000 permutations using the adonis2 function in the vegan package in R. To be certain that any composition differences were not due to differences between homogeneity of dispersion within groups, the multivariate distances of samples to the group centroid were measured using the betadisper() function. All beta diversity estimates described here were repeated with MOTUs agglomerated to species, genus, family, and order levels.

Hierarchical clustering
Hierarchical clustering was performed to show how the chosen primer affects the grouping of samples. Clustering was performed on each sample using the hclust() function in R, with the UPGMA method. Clustering was also performed on samples grouped according to predator species and primer using the average Relative Read Abundance (RRA) values.

Random forest classifier
While different primers will amplify different taxonomic groups, it is desirable to determine which of the tested primers will amplify a greater range of taxa important to characterising the diet of that predator species. The random forest classification (RFC) is a supervised learning method that classifies samples (such as prey composition) to their source, which estimates the level of importance of each prey item to that classification and determines the accuracy of that classification (Breiman 2001). Here, RFC models were run to first determine which primer amplifies taxa that are most appropriate for classifying samples to predator species, and then again second to classify samples to the correct predator species based on the prey composition.
RFCs were performed on samples using the randomForest R package (Liaw and Wiener 2002) using 10,000 trees. The out-of-bag (OOB) error was used to measure the accuracy of classification of samples to their correct group. The most important prey taxa contributing to classification of samples were established using the 'Mean Decrease Mini' values.

Bioinformatics and MOTU filtering
The MiSeq sequencing run produced 18,527,116 sequence reads; 48.4% associated with bat samples and 49.5% associated with shrew samples. A sequence clustering threshold of 98% was used for downstream analyses. This clustering threshold identified MOTUs that had the highest species and genus-level assignment rates, with lower levels of assignment restricted to family and order level (Fig. S2). This threshold has been used by many other studies using the COI region for invertebrate detection .
The dataset utilising the sequence clustering threshold at 98% similarity yielded 9647 non-singleton MOTUs and 7698 non-singleton MOTUs for the Gillet and Zeale datasets, respectively. In the negative controls, the Gillet dataset returned 5085 reads from the Chiroptera order (< 0.13% of all Chiroptera reads) and 56 reads from Homo sapiens (~ 3.25% of all human reads). The Rhinolophidae reads in the negative control accounted for only 0.08% of all host reads across the entire dataset amplified by the Gillet primer set. These MOTUs were excluded from further analyses.
After removing MOTUs according to filtering criteria and samples with low read counts, the Gillet dataset contained 945 MOTUs across 22 R. hipposideros, 7 C. russula, and 15 S. minutus, with an average read depth of 37,555 reads per individual. The Zeale dataset contained 929 MOTUs across 23 R. hipposideros, 4 C. russula, and 11 S. minutus with an average read depth of 159,589 per individual. Rarefaction curves showed that all prey taxa were detected between 1000 and 5000 reads for each sample (Fig. 1a: inset) and the depth_cov, (q value = 1) function showed a sample coverage of > 97% for Zeale and > 98% for Gillet.

Taxonomic identification and range
Both primers detected similar numbers of MOTUs; the Gillet primers detected MOTUs that were taxonomically assigned to 240 species, 230 genera, 129 families, and 27 orders. The Zeale primers detected MOTUs that were Both primers detected a similar number of prey taxa in bats (Fig. 1). The majority of taxa detected belong to the orders Lepidoptera and Diptera, with some taxa within the Trichoptera order. Gillet also detected a small number of taxa from Hymenoptera and Araneae in the bat diet. Haplotaxida were detected by the Gillet primers, but this is likely due to environmental contamination. Although both primers detected the majority of species within Lepidoptera and Diptera in bat samples, there was a relatively even distribution of taxa detected by one and both primers (Fig. 1b).
There was a more prominent difference between primers for taxa detection in shrews. The majority of taxa identified by Zeale were within the orders Lepidoptera, Diptera, Coleoptera, and Araneae (Fig. 1a). Gillet detected taxa from a much wider order of terrestrial invertebrates (such as Haplotaxida, Hemiptera, Stylommatophora, Isopoda, and more) that are considered important in the diet of shrews (Pernetta 1976; Churchfield and Rychlik 2006). Additionally, Gillet detected substantially more species, genera, families, and orders that Zeale could not (Fig. 1b). The three orders detected by only Zeale are Sacoptiformes, Neuroptera, and Blattodea which contained only 2, 7, and 2 MOTUs, respectively.
As expected, the only primer set here to detect vertebrate DNA was the Gillet primers. Between 89 and 99% of reads in bats were of vertebrate origin and between 0.81 and 99% of reads in shrew samples were of vertebrate origin.

Composition of diet
The average relative read abundance (RRA) of prey order in R. hipposideros diet did not dramatically change between primer sets (Fig. 2b). Both primers showed that the diet mostly consisted of Diptera and Lepidoptera, but only the Gillet primers showed a noticeable proportion of the diet consisting of Hymenoptera and Trichoptera. Using a combination of both primers showed a stronger similarity to using Zeale primers alone, complementing the hierarchical clustering (Fig. 2a).
When used individually, the Zeale and Gillet primer sets suggested that a large proportion of the diet of S. minutus consisted of Lepidoptera, Diptera, and Coleoptera (Fig. 2b). Only the Gillet primers suggested the additional importance of other orders such as Araneae, Hymenoptera, Isopoda, Opilliones, and Trombidiformes as contributing to the diet of S. minutus. Using both primers to determine the diet of S. minutus demonstrated a strong influence by Gillet, complementing the hierarchical clustering (Fig. 2a), but with larger proportions of Lepidoptera, Diptera and Coleoptera.
Crocidura russula showed the largest differences in diet when analysed by Zeale or Gillet primers (Fig. 2b). Again, Zeale was restricted to Lepidoptera, Diptera, and Coleoptera. Gillet suggested the importance of terrestrial invertebrates such as Haplotaxida, Glomerida, Isopoda, Mesostigmata, and Stylommatophora. Using a combination of both primers resembled the diet suggested by Gillet alone, complementing the hierarchical clustering (Fig. 2a).

Alpha diversity
After agglomerating taxa to their highest taxonomic level, the Gillet and Zeale datasets consisted of 425 and 371 prey MOTUs, respectively, with a combined richness of 660 MOTUs (Table 1). The mean alpha diversity measures were higher in R. hipposideros compared to shrews, with S. minutus marginally higher than C. russula (Fig. 2c, d). For species richness, Tukey post hoc comparison of means showed that R. hipposideros samples amplified with both primers had an average of between 13.36 and 20.5 more MOTUs detected than all C. russula samples (all adjusted p values < 0.01), between 11.2 and 17.3 MOTUs more than all S. minutus samples (all adjusted p values < 0.01) and 8.6 more MOTUs than R. hipposideros amplified with Zeale (adjusted p value < 0.02). Rhinolophus hipposideros samples amplified with Gillet primers had an average of 14.2 and 17.5 more MOTUs than S. minutus and C. russula samples amplified with Zeale, respectively (adjusted p value < 0.001).
For Shannon diversity, the Tukey post hoc comparison of means showed significantly lower diversity (adjusted p value < 0.05) in C. russula amplified by Zeale primers compared to all bat samples, and S. minutus samples amplified with Gillet primers. Amplifying C. russula samples with both primers produced significantly lower diversity values than R. hipposideros amplified with either Gillet or both primers. Sorex minutus samples amplified with Zeale primers had significantly lower values than all R. hipposideros samples. One notable difference is the significantly lower Shannon diversity in S. minutus samples amplified with Zeale compared to Gillet (adjusted p value = 0.019).

Beta diversity
PERMANOVAs estimated a significant, but minor, difference in the composition of prey detected in R. hipposideros when using Gillet vs Zeale (R 2 = 0.08, Pr(> F) = 0.001) and Gillet vs Both (R 2 = 0.05, Pr(> F) = 0.001) but not for Zeale vs Both (R 2 = 0.006, Pr(> F) = 1). The NMDS plot (Fig. 3) showed that bats amplified with Gillet, Zeale and both primers clustered close together which also suggested that compositional differences are likely minor. There was also a minor, but significant, difference in the prey composition detected in shrews when comparing Gillet vs Zeale samples (R 2 = 0.038, Pr(> F) = 0.029) (also seen in Fig. 3). Each primer set could detect a composition difference between R. hipposideros and shrews (R 2 = 0.044-0.067, Pr(> F) < 0.01), which is a visibly clear pattern in the NMDS plot in Fig. 3.
The Tukey pairwise comparison showed no difference in the homogeneity of these tested groups, but the permutest showed a difference between S. minutus amplified with Zeale primers against all C. russula samples, which may have influenced the PERMANOVA results. The permutest also showed a difference between the homogeneity of C. russula amplified with Zeale compared to either Gillet (p < 0.01) or both primers (p < 0.001). These differences should be considered while interpreting compositional differences as homogeneity can influence PERMANOVA results.
R. hipposideros mainly predates on Diptera and Lepidoptera (Fig. 2b), which may explain why they remain a tight cluster in the NMDS plots as MOTUs are agglomerated up to order level (Fig. 3). Although shrews (particularly S. minutus) also predate on Diptera and Lepidoptera (Fig. 2b), they remain distinct from R. hipposideros when MOTUs were agglomerated to species level. As MOTUs are agglomerated to higher levels, the coordinates of some shrews migrate and cluster closer to R. hipposideros. This suggests that there are common prey orders between the three insectivore species, but bats and shrews still predate on different species, genera, and families within these common prey orders.

Random forest classifier
At the species level, RFC models were able to classify samples as originating from R. hipposideros or shrews with an accuracy of 100% using Zeale, 88.64% using Gillet, and 93.48% using both. Amongst the top 20 most important taxa (MOTUs with the highest Mean Decrease Mini values) for classifying samples to bat or shrew, the most common prey order was Diptera and Lepidoptera for each primer used. The accuracy was much lower for classifying samples to C. russula or S. minutus using Zeale (73.33%), Gillet (68.18%), or both (68.18%). The top 20 taxa for classifying species of shrew mainly consisted of taxa within Lepidoptera and Coleoptera when amplified using Zeale primers. Using Gillet, or both primers, the top 20 taxa were distributed more evenly amongst more orders such as Haplotaxida, Opiliones, Stylommatophora, and Diptera.
Bat samples could be classified to Zeale or Gillet with a high accuracy of 93.33%, while the accuracy to classify between Gillet and both primers decreased to 73.91%, and between Zeale and both decreased to 70.21%. Shrew samples could be classified between Zeale and Gillet with a lower accuracy of 83.78%. However, accuracy drastically decreased when classifying shrews between Zeale and both primers (54.05%) or between Gillet and both primers (2.27%). Full details on the 20 taxa with the highest mean Decrease Gini values can be found in Tables S2-S13.

Discussion
Here, we show that two different COI primer sets performed differently for detecting invertebrate prey composition across a broad ecological range, meaning that primer choice will have a significant impact on ecological inferences from the data generated with them. Primer comparisons for determining the diet in insectivorous mammals have previously been performed on single species or multiple species within the same ecological niche (e.g., bats; Tournayre et al. 2020). Here, we compared two widely used primer sets (Zeale and Gillet) on multiple mammals occupying different niches and demonstrated that while one primer set captured the breadth of prey for grounddwelling shrews, both primer sets were required to fully capture the diet of bats within the studied systems.
When comparing the Zeale and Gillet primer sets, the first obvious and major advantage of the Zeale primers was that there was practically no host amplification, meaning that all information retained by the Zeale primer pair represents potential prey. In contrast, the Gillet primers co-amplified large amounts of host DNA (up to 99% in some samples), which has also been observed in the previous studies Esnaola et al. 2018;Galan et al. 2018). The varied amount of host amplification between samples in this study highlights that rates of host amplification may be unpredictable to an extent. Host amplification affected S. minutus less than R. hipposideros and C. russula, and some technical and biological issues should be taken into account when analysing the difference found between species in regard to host amplification. For example, considering that the shrew samples were gut contents from dissection, 'empty' stomachs may have influenced the higher rate of host DNA amplification in the absence of prey DNA in some predators.
Apart from host and human DNA, the Gillet primers detected trace amounts of DNA from other vertebrates such as bank voles (Myodes glareolus), cattle (Bos taurus), and pig (Sus scrofa). These taxa contributed to between 2 and 16 reads in total, likely through secondary detection from invertebrate prey coming into contact with other vertebrates or their excrement before consumption. This is an unsurprising result as the previous studies have detected various species of birds, mammals, and amphibians with the Gillet primers (Biffi et al. 2017b;Esnaola et al. 2018;Galan et al. 2018). Host amplification is not desirable here, but the capability to amplify vertebrate DNA is beneficial to determine if the invasive C. russula (in Ireland) are consuming local vertebrate taxa (McDevitt et al. 2014).
This level of host amplification means that the average number of reads attributed to invertebrates in each sample was approximately three times lower in Gillet compared to Zeale. An insufficient read depth will reduce the likelihood of detecting the entire prey community, but rarefaction estimates suggested that the majority of prey were detected with a sequencing depth of between 1000 and 5000 reads (Fig. 1a). Despite the reduced read depth for prey using Gillet, more samples satisfied the filtering criteria when amplified with Gillet rather than Zeale. This is due to the Gillet primers ability to amplify a wider range of taxa, including an additional 14 orders (Fig. 2b). Many of these additional orders constitute a large portion of different shrew species' diet, such as slugs/snails (Stylommatophora), spiders (Araneae), woodlice (Isopoda), millipedes (Polydesmida), and worms (Haplotaxida) (Fig. 2b; Pernetta 1976). These results showed that after removing host sequences, Gillet primers provided more information on invertebrate prey than Zeale without using blocking primers once sufficient sequencing depth is achieved. Furthermore, blocking primers can mitigate host DNA amplification, but requires more time to design and test as they might also block amplification of target prey taxa (Piñol et al. 2015) and would be particularly challenging when investigating multiple species simultaneously as undertaken here.
The Zeale primers are extensively used and have proved very efficient in determining the diet of bats (Vesterinen et al. 2018), but this trial showed that in terrestrial insectivores, they are still mostly limited to the three orders: Coleoptera, Diptera, and Lepidoptera. They are more suitable for bats, as even the Gillet primers with their wider taxonomic range show that Diptera and Lepidoptera are the main constituents of their diet (Fig. 2b) and are in agreement with previous studies on R. hipposideros Baroja et al. 2019). Due to Zeale's high affinity to Coleoptera, Diptera, and Lepidoptera, shrew diets were biased towards these orders (Fig. 2b). In addition, the rate of shrew samples filtered out due to low read counts was much higher than with Gillet. It was evident from this study (and previous studies;Ware et al. 2020) that shrews also rely on other terrestrial invertebrate orders such as Gastropoda, Isopoda, and Haplotaxida (Fig. 2b). Zeale's inability to detect these taxa means that many shrew samples were filtered out during bioinformatic processing. Using the Gillet primers, some of the orders listed as substantial in the diet of shrews were also detected in the bat diet (i.e., Araneae and Haplotaxida) (Fig. 2b). While Aranea have previously been identified in the diet of bats (McAney and Fairley 1989), Haplotaxida have not. This unexpected detection is likely a result of environmental contamination . In each of the R. hipposideros roosts sampled in Ireland, large sheets of plastic were laid down to collect faecal samples and left exposed for a period of up to 2 weeks. Therefore, organisms coming into contact with the samples from nearby guano piles during this time may explain their detections, as Haplotaxida have been reported in bat guano elsewhere (Novak et al. 2014).
Recent studies suggest that using more than one primer will cover a wider range of taxa and give a more informative overview of the diet of these animals (Esnaola et al. 2018). This is true considering they both amplify unique taxa. For example, even though Zeale and Gillet both amplify MOTUs within the orders Diptera and Lepidoptera, they each amplify several unique MOTUs/species within each ). In addition, the RFC analysis showed that there is still a relatively high accuracy differentiating bat samples amplified with Zeale or Gillet (> 90%) and only decreased in accuracy to ~ 70% when including samples amplified with both primers. This supports both that primers are contributing relatively evenly in detecting unique components of the diet of bats. The composition of the detected diet of shrews using both primers appeared heavily influenced by the Gillet set, rather than Zeale (Figs. 2, 3), which was particularly apparent at the order level. The RFC analysis had a very low efficiency differentiating samples that had been amplified with Gillet primers or both (2.27%). In addition, when considering the same finite number of sequences that can be generated, combining the Zeale and Gillet data increased diversity of shrew prey detected compared to Zeale alone, but did not significantly increase diversity compared to using Gillet alone (Fig. 2c). This was likely due to Gillet detecting more substantial components of a shrew's diet such as slugs/snails, spiders, woodlice, millipedes, and worms. Due to the amplification biases associated with different primers, using a combination of primers will also restrict dietary studies to frequency/occurrence-based analyses. This is because different primers can preferentially amplify certain taxonomic groups. For example, the proportion of Diptera and Coleoptera may be overrepresented in the C. russula diet when primers are combined (Fig. 2b) due to Zeale's affinity to these taxa and higher sequencing depth due to non-amplification of host reads. Although many studies stick to a more conservative frequency-based interpretations of dietary data, relative read abundance (RRA) can still accurately represent the proportions of prey in an animal's diet at the population level (Deagle et al. 2019). Combining both primers used here (and in future studies) will require the sequencing depth to be normalised between the primer datasets if RRA methods are to be used, since the proportions of prey taxa become skewed in favour of Coleoptera, Diptera, and Lepidoptera (Fig. 2b).
Including both primers in a full-scale analysis will obviously increase costs and labour, so the research question to be addressed becomes the critical component when deciding which primer set(s) combination to use when investigating mammalian insectivore diet. For the species considered here, the Gillet primers amplify a wider range of taxa and may be sufficient to address ecological questions around dietary composition (e.g., spatial and temporal shifts) and competition/overlap between species (particularly for shrews). However, given the importance of bats in providing ecosystem services, and their potential role as 'natural samplers' (Siegenthaler et al. 2019) for undertaking invertebrate surveying, multiple primer sets would be required, particularly when individual pest species may need to be identified and/ or monitored ).

Supplementary Information
The online version contains supplementary material available at https:// doi. org/ 10. 1007/ s42991-021-00115-4. Acknowledgements SSB was supported by a Pathway to Excellence PhD Scholarship from the University of Salford and TGC was supported by a Waterford Institute of Technology and Environmental Protection Agency (EPA) cofund PhD Scholarship, funded under the EPA Research Programme 2014-2020. The EPA Research Programme is a Government of Ireland initiative funded by the Department of the Environment, Climate and Communications. It is administered by the EPA, which has the statutory function of co-ordinating and promoting environmental research. Thanks to the local National Park and Wildlife Service (NPWS) conservation rangers for sampling bat droppings. Fieldwork in relation to shrew sampling was supported by grants from the Vincent Wildlife Trust and a Genetics Society Heredity Fieldwork Grant awarded to SSB and ADM. Laboratory work was supported by grants from Bat Conservation Ireland and a University of Salford Internal Research Award awarded to DBO'M and ADM, and a University of Salford Pump Priming award to ADM. NGS also thanks FCT/MCTES for the financial support to CESAM (UID/AMB/50017/2019), through national funds. Finally, we are grateful to the reviewers for their valuable and constructive comments which improved the manuscript.
Author contributions ADM, SSB, TGC, and DBO'M conceived and designed the study. Bat faecal sampling was part of APH, D'ON, and DBO'M's project on non-invasive genetic monitoring of lesser horseshoe bats. Shrew sampling was part of SSB, REA, and ADM's project on dietary and microbial associations between shrew species in Ireland. SSB, TGC, and NGS performed the laboratory work and bioinformatics associated with the DNA metabarcoding. SSB and TGC analysed the data. SSB, TGC, and ADM wrote the paper, with all authors contributing to editing, discussions, and approval of the final manuscript.

Conflict of interest
The authors declare that they have no known personal relationships or competing financial interests that could have influenced the work conducted in this study.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.