Bioassessment of benthic macroinvertebrates in wetlands: a paired comparison of two standardized sampling protocols

We compared a rapid bioassessment protocol (Traveling Sweep Approach [TSA]) with a more conventional, time-intensive protocol (Composite Transect Approach [CTA]) to describe macroinvertebrates in wetlands in Alberta, Canada. We collected one macroinvertebrate sample using each protocol from 16 wetlands and compared abundance, catch per unit effort, and relative abundance between sample protocols. We also quantified and compared the logistics required to implement each protocol. The macroinvertebrate communities differed statistically between protocols for all three response variables; however, the differences were generally small and the communities were similar. The CTA protocol tended to yield higher variability in the samples, likely driven by the way these samples are collected and composited, which may introduce an unwanted source of variation when the primary monitoring objective is to assess effects of human activities over time and between sites. The CTA protocol also required significantly greater investment of time (ca. 50% greater processing time), money (ca. 1.9 times sample processing cost), and resources to execute (e.g., requirement for watercraft). Both protocols provided adequate characterization of macroinvertebrate communities in wetlands, but differences in variability and resources for deployment and processing are important considerations when choosing a sampling protocol. The rapid, time-limited sweep protocol (TSA) appears to be a viable monitoring approach given that the macroinvertebrate communities identified by the two protocols were relatively similar but the TSA protocol collected them at a lower cost.


Introduction
Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s11273-020-09708-1) contains supplementary material, which is available to authorized users.
Wetlands are among the most important ecosystems worldwide because they help provide and purify water, protect communities from floods and drought, recharge and discharge groundwater, support rich biodiversity, store more carbon than any other ecosystem, and provide food and important cultural values to local communities (Millennium Ecosystem Assessment 2005; Ramsar 2018). Unfortunately, wetland areas and types are in decline across the planet, and about 50% of inland water habitats were lost in the twentieth century (Finlayson and D'Cruz 2005). In Alberta, Canada, wetlands cover approximately 28% of the landscape (Alberta Environment and Parks 2018), and it is estimated that up to 70% of southern Alberta's wetlands have been lost or degraded (Warner and Asada 2005). It is important to monitor wetland health to assess responses to long-term trajectories associated with climate change and also to evaluate the efficacy of wetland management and the extent to which management actions mitigate impacts from human activities. Monitoring the ecological condition of wetlands is challenging because wetlands support diverse and heterogeneous habitats, can be difficult to access, and can make it difficult to collect the biological communities used to assess potential environmental impacts. Environmental impacts are frequently assessed by collecting and measuring organisms present in an ecosystem via a process called biomonitoring. In freshwater ecosystems, biomonitoring often includes sampling of the community composition of algae, aquatic vegetation, benthic macroinvertebrates, and/or fish as "bioindicators." Bioindicators are advantageous as measures of ecological integrity because they reflect habitat conditions over time rather than the one-time snapshot of conditions provided by sampling surface water and sediments. Benthic macroinvertebrates are commonly used in monitoring studies to assess potential impacts because they meet the requirements of effective bioindicators.
They are (i) taxonomically/ecologically well-understood, (ii) easily sampled, (iii) easily identified, (iv) continually exposed to natural or anthropogenic disturbances when they occur, (v) known to integrate episodic or cumulative water quality impacts in time and space and often show predictable responses to these disturbances, and (vi) known to have strong ecological relationships with other target groups such as fish (Noss 1990; Pearson 1994; Burton et al. 1999; Heino et al. 2009; Uzarski et al. 2017). In wetlands, macroinvertebrates have been previously used as bioindicators of eutrophication (Chessman et al. 2002), impacts of mining activities (Thomas and John 2006), ecological condition of constructed wetlands (Awal and Svozil 2010), and general urban and agricultural development in watersheds (Gezie et al. 2017).
Jurisdictions around the world have developed standardized sampling methodologies to collect aquatic macroinvertebrates for biomonitoring; however, these methodologies often differ in sampling equipment (e.g., kick nets, sweep nets, sampling cylinders, and artificial substrates), habitat sampled (e.g., depositional vs. erosional habitat), temporal and spatial duration of sampling, handling of samples (e.g., composite vs. replicate samples), and sample processing and identification (Carter and Resh 2001; Haase et al. 2004; Poikane et al. 2016). Methodological differences such as these can result in very different characterizations of benthic communities. For example, quantitative methods (e.g., sampling cylinders) can yield a more precise measure of invertebrate abundance than semi-quantitative methods (e.g., sweep nets) because the sample area is more precisely defined and measures of variation between samples are often lower (Carter and Resh 2001). Compositing samples can smooth the fine-scale variation in macroinvertebrate community composition present between microhabitats in a sampled reach or habitat type; however, compositing samples also eliminates the ability to assess within-site variation (Carter and Resh 2001).
Even where methodologies to collect benthic macroinvertebrates have many similarities, finer-scale differences in sampling equipment, sample location, and number of samples can confound comparisons of benthic communities characterized by the samples (Carter and Resh 2001). For example, when identical sampling devices with different mesh sizes are used, smaller meshes can collect higher abundances and richness of invertebrates (Slack et al. 1991) and increase the relative abundance of small taxa like microcrustaceans (Carter and Resh 2001). Differences in sampling and laboratory methodologies also have significant implications for the cost of a monitoring program (Haase et al. 2004), and within a single sampling protocol, the response variable chosen can significantly impact the interpretation of data collected during monitoring (e.g., Lindig-Cisneros et al. 2003). The choice of a sampling methodology can therefore impact both the efficacy of a monitoring program and the ability of that program to generate data robust enough to address its objectives within prescribed financial limitations (e.g., Olsen et al. 1997).
In Alberta, benthic macroinvertebrates have been used as bioindicators of wetland condition for several decades, primarily using a Composite Transect Approach (CTA). This is a time-intensive protocol in which multiple samples are taken along transects through a portion of the wetland. These samples are subsequently pooled to create a large composite sample. More recently, the Canadian Aquatic Biomonitoring Network (CABIN) has developed a rapid bioassessment protocol in which a single sample from a 2-min traveling sweep through the emergent and submergent vegetation zones of a wetland (Traveling Sweep Approach; TSA) is taken to characterize the composition of the macroinvertebrate community.
Our study has three primary objectives: (1) to conduct a quantitative comparison of the abundance, catch per unit effort, and relative abundance of the macroinvertebrate communities collected by the CTA and TSA protocols; (2) to assess the variability within and between the protocols; and (3) to compare the logistical and cost investments required to implement the protocols. The results will provide a general case study of the potential differences between two protocols to collect benthic macroinvertebrates, add to the growing literature on standardized benthic sampling programs for wetlands, and provide information for those choosing between protocols similar to those presented here.

Study sites and wetland characteristics
The study area (Supplementary Fig. A) was located in north-central and northern Alberta. Sixteen wetlands were sampled in August 2017: 12 natural wetlands and 4 man-made ponds at various stages of reclamation. Only open-water wetlands with a minimum depth of 0.5 m were chosen for sampling.
At each wetland, several physical and chemical parameters (Table 1) were measured including depth (both maximum and average of 10 spot measurements), temperature, dissolved oxygen, conductivity, and pH (latter 4 variables measured in situ with a Hydrolab multiparameter data sonde). Open water extent was determined by manual interpretation of SPOT 7 satellite imagery most recent to the time of sampling (i.e., images captured between May and August 2017). GIS analysis was used to determine the percent human footprint in a 500 m (regional scale) buffer from the edge of surface water based on the Alberta Human Footprint Monitoring Program inventory (Alberta Environment and Parks 2016).
Wetlands ranged in average depth from 0.5 to 5.0 m and in maximum depth from 0.6 to 9.4 m. Maximum open water area was 1434 ha, but this wetland was considerably larger than other study wetlands; the remaining wetlands ranged from 0.3 to 59.9 ha. Primary land use surrounding wetlands was generally forestry cutblocks and vegetated verge, although three wetlands had no human footprint in the 500 m buffer. All wetland parameters are summarized in Table 1.

Field methods
During a single visit at each wetland, paired benthic macroinvertebrate samples were collected using the CTA and TSA wetland sampling protocols. Samples were collected by experienced field staff who deployed both sampling protocols at each wetland. Both protocols were conducted concurrently and far enough apart so that the execution of one protocol did not impact the other.

Composite transect approach: intensive survey approach
The CTA protocol is an intensive survey method designed to sample several different locations in a wetland, including nearshore and offshore areas, to assess biodiversity of wetlands and track long-term regional changes through time. Wetlands were sampled using the CTA protocol as follows. Macroinvertebrate samples were collected along a triplicate transect design, where three fixed points were visited from the wetland margin (0 m, 25 m, and 50 m) along each transect. Transects were spaced at least 50 m apart, each starting at the wetland margin (i.e., emergent vegetation zone) and extending toward the centre of the wetland (i.e., open water zone). Upon reaching a sampling point, the sampler was instructed to survey the area within a 10 m radius of the point and choose a specific location to sweep that contained at least 50% cover of rooted aquatic macrophytes. If no dense macrophyte patches were found within the surveyed area, the sampler collected the sweep from the area of highest macrophyte cover available. While working from a kayak, field staff dipped a 500 µm D-net into the water and pulled it up through the water column three times at each sampling location. If a sampling location was less than 1 m deep, the sweeps began just above the wetland substrate and proceeded up through the water column. If the sampling location was greater than 1 m deep, the sweeps only included the top 1 m of the water column.
In addition to the nine samples collected along three transects, a 10th sample was taken at the deepest point of the wetland. Again, the sampler was instructed to sweep within an area of dense macrophyte coverage. The distances between grid points and transects were reduced when necessary in very small wetlands. Material collected from the ten sampling locations was then combined into one composite sample per wetland. A detailed description of the protocol is available at: https://www.abmi.ca/home/publications/1-50/45.html. We also recorded the amount of time to complete field collections (± 1 min) for each protocol by quantifying major field tasks.

Traveling sweep approach: rapid time-delineated approach
The TSA protocol is a rapid assessment method that samples only the shallow littoral zones of wetlands and is designed to provide a rapid assessment of macroinvertebrate communities that can be compared among wetlands. Samples were collected using the TSA protocol as follows. The sampler chose a location along the shoreline deemed most representative of the wetland (i.e., vegetation type and cover) and completed a single, 2-min traveling sweep with a 400 µm triangular benthic kick net. While wading through the emergent and submerged aquatic vegetation zones in the wetland margin, the sampler moved forward in a zig-zag pattern and worked the net from the wetland bottom to near the surface through the vegetation. The sampler took care not to press the net into the substrate and to maintain forward movement to keep flow moving through the net. The protocol is modified from the wadeable streams guide (Environment Canada 2008), is standardized among sites by unit effort (time), and results in one sample for each wetland. Although area swept is not recorded for this time-standardized protocol, a person [...]
After collection, benthic macroinvertebrate samples were preserved in the field with either 10% buffered formalin (CTA) or 95% ethanol (TSA) and stored in the dark until being sent for laboratory processing. For both protocols, large pieces of vegetation were rinsed within the net and returned to the wetland; small pieces and highly dissected pieces of vegetation were retained in the sample for later processing in the lab. A bulk wet weight (kg ± 1 g) was also recorded for each sample (consisting of sample jar, sample material, and preservative) before it was shipped for laboratory processing.

Laboratory methods
All samples for both protocols were sent to a single certified contractor for processing, identification, and counts following Armellin et al. (2018). The contractor held relevant genus- and family-level certifications from the Society for Freshwater Science.
Samples were cleaned of excess vegetation and subsampled using a Marchant box. A minimum of 5% of each sample was sorted, unless 2500 individuals were enumerated and identified before reaching a minimum of 5% (one instance; CTA sample from HWC01). Identification and counts of microcrustaceans (copepods, cladocerans, or ostracods), nematodes, Turbellaria, and tardigrades were conducted; however, only results for macroinvertebrates are presented here, as they are the targets for both monitoring protocols. Macroinvertebrates were identified to genus or species where possible, but many juvenile and damaged individuals often prevented identification beyond the family level. Total counts were provided for each taxon, and these counts were summed to generate a family-level abundance for each family in each sample. Cordillera Consulting performed quality assurance and quality control (QA/QC) procedures on samples consistent with standardized protocol (Environment Canada 2014).
All samples passed QA/QC standards for taxonomic identification. The percent identification error on QC samples ranged from 0.00 to 2.25%. Most errors occurred at the genus or species level within a family and therefore did not impact the family-level analyses presented here. Nine samples were resampled to quantify sorting efficiency. Of these, seven passed the 95% standard for the resorting recovery rate. One sample failed at 94% recovered, and a single sample failed at 77% recovered. It was determined that this sample failed because the person conducting the subsampling overlooked some burrowing taxa present in macrophyte stems (primarily oligochaetes and chironomids); however, adding the relatively small number of missing taxa from these two abundant groups had no effect on the results of exploratory analyses with and without the missed individuals.
In the laboratory, technicians recorded the time spent processing each sample, including subsampling and macroinvertebrate identification. Information on the amount of time to collect samples in the field, total number and cost of jars, total weight of samples, and cost of processing and preserving samples was also recorded for comparison purposes.

Statistical methods
Most analyses were conducted at the family level, with the exception of the genus-level accumulation curves for each protocol and of taxa richness (sum of unique taxa), Shannon-Wiener H' (Shannon 1948), and Simpson's Index (D) (Simpson 1949), which were analyzed at the laboratory's lowest feasible taxonomic level (LFTL), typically genus or species. Families with at least three occurrences across all samples were included in analyses. Families unique to a protocol were noted (Supplementary Information, Table A) but not included further in analysis, including multivariate analysis; these families were uncommon (occurring in no more than three of 32 total samples), and it is common practice to remove rare taxa prior to conducting ordinations (Peck 2010; see also Boersma et al. 2016; Gleason and Rooney 2017). Some juvenile/damaged individuals were only identified to the order (or higher) level; these taxa were included in analyses of total macroinvertebrate counts but were excluded from analyses at the family level.
We chose to conduct the majority of our analyses at the family level, in part, to include early instars and damaged individuals that could be confidently identified to family but not to genus. Although analyses at finer taxonomic levels can sometimes reveal patterns not seen at the family level (Lenat and Resh 2001), analyses at the family level are common in biomonitoring studies where limitations in time and expense may preclude identification and analyses at finer taxonomic levels (e.g., Bailey et al. 2001; Chessman et al. 2002; Gleason and Rooney 2017). As well, multivariate techniques used to investigate impacts, such as those used here, often show strong correlation between family-level results and results at lower taxonomic levels (e.g., Lenat and Resh 2001; Buss et al. 2015; Culp et al. 2018).
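As an illustration of the three diversity measures named above (taxa richness, Shannon-Wiener H', and Simpson's Index), the following is a minimal Python sketch, not the laboratory's or the authors' actual computation. It assumes natural logarithms for H' and the dominance form of Simpson's Index (D = Σ p_i²); the source does not state which logarithm base or which Simpson variant was used. The function name and the example counts are hypothetical.

```python
import math


def diversity_indices(counts):
    """Return (richness, Shannon-Wiener H', Simpson's D) for one sample.

    counts: dict mapping taxon name -> number of individuals
            (hypothetical input format, not taken from the study).
    Assumptions: H' uses natural logs; D is the dominance form sum(p_i^2).
    """
    # Keep only taxa that are actually present in the sample.
    present = {taxon: n for taxon, n in counts.items() if n > 0}
    total = sum(present.values())
    if total == 0:
        return 0, 0.0, 0.0

    proportions = [n / total for n in present.values()]
    richness = len(present)                                   # sum of unique taxa
    shannon = -sum(p * math.log(p) for p in proportions)      # H'
    simpson = sum(p * p for p in proportions)                 # D
    return richness, shannon, simpson


# Hypothetical counts for a single sample.
sample = {"Chironomidae": 50, "Caenidae": 30, "Hyalellidae": 20}
richness, shannon, simpson = diversity_indices(sample)
```

If the complement (1 - D) or the inverse (1/D) form of Simpson's Index is wanted instead, only the last computed value needs to change.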
Characterizing macroinvertebrate communities between CTA and TSA protocols
Our first objective was to characterize the macroinvertebrate communities collected by the CTA and TSA protocols. The total count of individuals in each family was used to create three response variables for analyses: abundance (total family count), catch per unit effort (CPUE; ind kg⁻¹; total family count/wet weight in kg of each bulk sample), and relative abundance (%; total family abundance/sum of abundance of all families). We compared the two protocols using all three response variables because there was no a priori expectation that they would co-vary in a similar fashion. Catch per unit effort and relative abundance were intended to help standardize for differences in sample weight and raw macroinvertebrate abundance, respectively, between the two protocols. To determine whether the protocols collect macroinvertebrate genera at a similar rate, we also plotted bootstrapped (n = 999) genus accumulation curves for the abundance dataset for both protocols.
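The three response variables defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary input format, and the example counts and weight are hypothetical, and the wet weight is taken to be the bulk sample weight in kg as described in the field methods.

```python
def response_variables(family_counts, wet_weight_kg):
    """Compute abundance, CPUE, and relative abundance for one sample.

    family_counts: dict mapping family name -> count (hypothetical format).
    wet_weight_kg: bulk wet weight of the sample in kg; in the source this
                   weight includes jar, sample material, and preservative.
    """
    if wet_weight_kg <= 0:
        raise ValueError("wet weight must be positive")

    total = sum(family_counts.values())
    # Abundance: the raw family count.
    abundance = dict(family_counts)
    # CPUE: individuals per kg of bulk sample.
    cpue = {fam: n / wet_weight_kg for fam, n in family_counts.items()}
    # Relative abundance: percent of all individuals in the sample.
    relative = {
        fam: (100.0 * n / total if total else 0.0)
        for fam, n in family_counts.items()
    }
    return abundance, cpue, relative


# Hypothetical sample: 200 individuals in a 2.0 kg bulk sample.
counts = {"Chironomidae": 120, "Caenidae": 60, "Naididae": 20}
abundance, cpue, relative = response_variables(counts, wet_weight_kg=2.0)
```

Because CPUE divides by sample weight and relative abundance divides by the sample total, two samples with identical raw counts can still differ in CPUE, which is one reason the three variables need not co-vary.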
We used multivariate and univariate techniques to compare differences in the macroinvertebrate communities. First, we summarized and compared the abundance, CPUE, and relative abundance of macroinvertebrate communities using Nonmetric Multidimensional Scaling (NMS) ordination based on Bray-Curtis distances in the programs PC-ORD (Version 7.03; McCune and Mefford 2016) and PRIMER-E (Version 7.0.13; Clarke and Gorley 2015). When NMS results recommended a 3-axis solution, we chose to graph the two axes that explained the greatest variation in the dataset. Vectors were plotted on ordination biplots for taxa with r² > 0.25, with the exception of CPUE, for which r² > 0.30 was used to reduce the large number of taxa appearing on the plot at r² > 0.25.
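For readers unfamiliar with the Bray-Curtis distance underlying the ordinations, a minimal Python sketch follows. It computes the standard dissimilarity, the sum of absolute differences divided by the sum of all counts, on untransformed values; whether PC-ORD or PRIMER-E applied any data transformation beforehand is not stated in the source, so none is assumed here. The function name and sample data are hypothetical.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples (0 = identical, 1 = no shared taxa).

    sample_a, sample_b: dicts mapping family name -> value (count, CPUE,
    or relative abundance). Families missing from a sample count as 0.
    """
    taxa = set(sample_a) | set(sample_b)
    numerator = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    denominator = sum(sample_a.get(t, 0) + sample_b.get(t, 0) for t in taxa)
    # Two empty samples are treated as identical (an assumption of this sketch).
    return numerator / denominator if denominator else 0.0


# Hypothetical paired samples from one wetland.
cta = {"Chironomidae": 10, "Caenidae": 5}
tsa = {"Chironomidae": 6, "Hyalellidae": 3}
dissimilarity = bray_curtis(cta, tsa)   # (4 + 5 + 3) / (16 + 5 + 3)
similarity_percent = 100.0 * (1.0 - dissimilarity)
```

The similarity percentage in the last line is the complement of the dissimilarity, which is the form in which Bray-Curtis values are usually reported as "percent similarity".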
Second, we performed multivariate statistical tests of differences between the macroinvertebrate communities collected by the two sampling protocols. Because wetland samples were paired, we used the Blocked Multi-Response Permutation Procedures (MRBP; McCune and Grace 2002) in PC-ORD to test for paired differences between the protocols. When MRBP identified significant paired differences between CTA and TSA communities, we examined vectors for individual taxa and also used the SIMPER procedure in PRIMER-E to determine which families were driving the differences. The SIMPER procedure identifies taxa that contribute most to the Bray-Curtis dissimilarities observed between the protocols (Clarke and Gorley 2015). In addition to MRBP tests, we used the RELATE procedure in PRIMER-E, which correlates the NMS plots for each protocol and provides an additional test of the similarity between macroinvertebrate communities.
Third, we applied the Wilcoxon paired sample test (Zar 1999) in SPSS (IBM SPSS Statistics for Windows, Version 22.0; Armonk, NY) to compare family-level abundance, CPUE, and relative abundance of macroinvertebrates between the CTA and TSA protocols. The non-parametric Wilcoxon test was used because the distribution of the paired differences between families rarely met the assumption of normality. A paired t-test was used to compare taxa richness, Shannon-Wiener H', and Simpson's Index, as these variables met the assumptions of the test. Results were considered significant when p < 0.05. We did not correct the alpha value defining significant results for multiple comparisons in order to investigate all potential differences that may exist between protocols.
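To make the paired Wilcoxon comparison concrete, the sketch below computes the signed-rank sums for one family across paired wetlands in plain Python. It is an illustration only, not the SPSS procedure the authors ran: it assumes zero differences are dropped and tied absolute differences receive average ranks, and it stops at the rank sums without computing a p-value. The function name and the example values are hypothetical.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, n_nonzero) for paired observations x and y.

    Assumptions of this sketch: pairs with zero difference are discarded,
    and ties in |difference| share the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)

    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1      # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, len(ordered)


# Hypothetical counts of one family in five paired wetlands.
cta_counts = [12, 7, 30, 5, 9]
tsa_counts = [10, 7, 18, 8, 4]
w_plus, w_minus, n = wilcoxon_signed_rank(cta_counts, tsa_counts)
```

The smaller of the two rank sums is the usual test statistic; a statistics package would then convert it to a p-value, exactly or by normal approximation depending on sample size.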

Wetland and procedural factors driving variation in macroinvertebrate communities
The second objective of this study was to compare the degree of variability of samples collected within and between each protocol. Because we collected a single sample of each protocol per wetland, the goal was to assess and compare overall within-protocol variability and not within-wetland variability. Therefore, when assessing variability, each wetland was considered as a replicate within a protocol. A protocol or response variable with high statistical variation among samples (i.e., replicate wetlands) may reduce the ability to detect statistical differences among treatments (e.g., control and impact sites) when wetlands are used as replicates in a monitoring design or when trends in wetland condition are followed through time.
First, we used the SIMPER and PERMDISP procedures in PRIMER-E (Anderson et al. 2008) to calculate the average Bray-Curtis similarities and multivariate dispersion, respectively, within and between protocols for all three response variables (Clarke and Gorley 2015). Second, we took several approaches to investigate how differences between protocols in bulk sample weight and depth of sampling locations, two of the most apparent differences between protocols, could impact results. For bulk sample weight, samples taken with the CTA protocol varied considerably among wetlands in the total mass of each sample collected, and similar large variation in mass was not seen with the TSA protocol. To quantify the potential impacts of this variation, we used linear regression to investigate the relationship between sample weight and total macroinvertebrate abundance. For depth of sampling locations, the CTA protocol prescribes a slightly different sweep technique (see description above) depending on whether a location on the sampling grid is less than 1 m deep (i.e., "shallow") or greater than 1 m deep (i.e., "deep"). The depth of sampling locations is not explicitly controlled within the CTA protocol, however, and the number of shallow versus deep sampling locations can vary considerably depending on the position of the sampling grid over the bathymetry of a wetland. Similar variability is not seen in the TSA, as the depth of a sampling location is restricted to wadeable areas.
To assess whether wetland bathymetry affected the variability of macroinvertebrate communities obtained by the CTA protocol, we grouped wetlands into one of three depth categories based on the number of shallow sampling locations (< 1 m) versus deep sampling locations (> 1 m) in each wetland: "Deep Wetlands" = 1-3 shallow locations in a wetland, remainder deep; "Mixed Depth Wetlands" = 4-5 shallow locations in a wetland, remainder deep; and "Shallow Wetlands" = 6-10 shallow locations in a wetland, remainder deep. Wetlands were relatively evenly distributed among groups: Deep Wetlands (n = 6), Mixed Depth Wetlands (n = 5), Shallow Wetlands (n = 5). Although each TSA sample comprises a single sweep taken at one location and not a composite of multiple sweeps at locations with different depths, we also tested the TSA dataset with the same wetland groups as the CTA dataset. If the variation in sampling depths in the CTA dataset alone impacts the structure or variability of macroinvertebrate communities, a similar pattern should not be seen in the TSA data from the same set of wetlands. Analysis of Similarity (ANOSIM) in PRIMER-E was used to test for differences in macroinvertebrate communities among the three groups.
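The depth grouping described above amounts to a simple classification rule, sketched here in Python. Two points are assumptions of the sketch rather than statements from the source: a location exactly 1 m deep is treated as deep, and a wetland with zero shallow locations raises an error because the published categories start at one shallow location. Function names and the example depths are hypothetical.

```python
def count_shallow(depths_m, threshold_m=1.0):
    """Count sampling locations shallower than the threshold (< 1 m by default)."""
    return sum(1 for depth in depths_m if depth < threshold_m)


def depth_group(n_shallow, n_locations=10):
    """Assign a wetland to a depth group from its number of shallow locations.

    Categories follow the text: 1-3 shallow = Deep, 4-5 = Mixed Depth,
    6-10 = Shallow. Zero shallow locations is not covered by the source
    categories, so this sketch refuses to guess and raises instead.
    """
    if not 0 <= n_shallow <= n_locations:
        raise ValueError("number of shallow locations out of range")
    if n_shallow >= 6:
        return "Shallow Wetlands"
    if n_shallow >= 4:
        return "Mixed Depth Wetlands"
    if n_shallow >= 1:
        return "Deep Wetlands"
    raise ValueError("no category defined for zero shallow locations")


# Hypothetical depths (m) at the ten CTA sampling locations of one wetland.
depths = [0.4, 0.8, 1.5, 2.0, 0.9, 3.1, 0.6, 1.2, 2.4, 4.0]
group = depth_group(count_shallow(depths))
```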
We also used the Bootstrap Average procedure in PRIMER-E to compare macroinvertebrate communities among the three depth groups. The Bootstrap Average procedure resamples each dataset and uses partial metric multidimensional scaling to provide a 95% confidence interval around the group mean of a category in multivariate space based on a prescribed number of bootstrap iterations; 100 bootstrap iterations were used for these analyses. Finally, we used the PERMDISP procedure to test the homogeneity of multivariate variance between the groups for both CTA and TSA datasets.

Time and budgetary differences
Our final objective was to compare the differences in time and resources required between the two protocols. We used paired t-tests to compare the logistical aspects of each protocol (e.g., time required to collect and process samples), as these data generally met the assumptions of the test.

Biodiversity and unique macroinvertebrate families and genera
In total, 51 families of macroinvertebrates were identified from all samples across both protocols. Lowest Feasible Taxonomic Level (LFTL) richness (t₁₅ = 0.26, p = 0.80), Shannon-Wiener H' (t₁₅ = -0.57, p = 0.58), and Simpson's Index (t₁₅ = 1.19, p = 0.25) did not differ between samples collected using the TSA or CTA protocols. Of the 51 families, six were unique to TSA samples and two were unique to CTA; both protocols also collected 20 unique macroinvertebrate genera (Supplementary Table A). The bootstrapped genus accumulation curves were nearly identical for both protocols (Supplementary Fig. B).
Macroinvertebrate abundance, CPUE, and relative abundance
For abundance, NMS ordination recommended a 2-axis solution with a final stress of 8.96 (Fig. 1a, b; p = 0.004). The two axes explained 81.5% of the variation in the dataset. There was a significant difference in macroinvertebrate abundance between communities characterized by CTA and TSA protocols (Blocked MRBP: A = 0.060, p = 0.012), but the A-value indicating effect size was small. SIMPER analyses identified Chironomidae, Caenidae, Naididae, Hyalellidae, and Planorbidae (in order of decreasing importance) as the primary families responsible for the dissimilarity between protocols. Despite the paired differences, communities in both protocols were significantly correlated (RELATE: R = 0.448, p = 0.002). Thirty-one of 41 families/groups (75.6%) had no significant differences between protocols (Table 2). Of the 10 families/groups that were different, eight had higher abundance in CTA samples and two had higher abundance in TSA samples (Fig. 2a).
For CPUE, NMS ordination recommended a 2-axis solution with a final stress of 9.72 (Fig. 1c, d; p = 0.0040). The two axes explained 85.1% of the variation in the dataset. There was a significant difference in macroinvertebrate CPUE between communities characterized by CTA and TSA protocols (Blocked MRBP: A = 0.14, p = 0.0002), but the A-value indicating effect size was small. Unlike for abundance, these differences were driven primarily by higher CPUE of families associated with TSA samples (Fig. 1c). SIMPER analyses identified Chironomidae, Caenidae, Hyalellidae, Planorbidae, Naididae, and Coenagrionidae (in order of decreasing importance) as the primary families responsible for the dissimilarity between protocols. Despite the paired differences, communities in both protocols were significantly correlated (RELATE: R = 0.303, p = 0.045). Twenty-six of 41 families/groups (63.4%) had no significant differences between protocols (Table 2). All 15 families/groups with significant differences had higher CPUE in TSA samples (Fig. 2b).
For relative abundance, NMS ordination recommended a 3-axis solution with a final stress of 9.71 (Fig. 1e, f; p = 0.0040). Axes 1 and 2 together explained 81.2% of the variation in the dataset and Axis 3 explained an additional 11% (cumulative variance explained = 92.2%). There was a significant difference in relative abundances of macroinvertebrates between communities characterized by the CTA and TSA samples (Blocked MRBP: A = 0.054, p = 0.013), but the A-value indicating effect size was small. SIMPER analyses identified Caenidae, Chironomidae, Naididae, Hyalellidae, Planorbidae, Chaoboridae, Baetidae, and Coenagrionidae (in order of decreasing importance) as the primary families responsible for the dissimilarity between protocols. Despite the paired differences, the communities in both protocols were significantly correlated (RELATE: R = 0.271, p = 0.036). Thirty-three of 40 families (82.5%) had no significant differences between protocols (Table 2). Of the seven families/groups with significant differences, two had higher relative abundance in CTA samples and five had higher relative abundance in TSA samples (Fig. 2c).
Variation in macroinvertebrate communities due to sampling protocol
We used Bray-Curtis similarities to investigate how similar the macroinvertebrate communities collected by the two protocols were for all three response variables. Of the three response variables, relative abundance yielded the most similar samples, both within and between protocols (Supplementary Table B). Multivariate dispersion did not differ between CTA and TSA samples for any of the three response variables: abundance (PERMDISP: F₁,₃₀ = 0.12, p = 0.80), CPUE (PERMDISP: F₁,₃₀ = 1.08, p = 0.39), or relative abundance (PERMDISP: F₁,₃₀ = 0.58, p = 0.50).
The CTA protocol resulted in the collection of variable (and often high) sample volumes that ranged from 1.1 to 12.2 kg wet weight per sample and required storage in 1-10 sample jars per wetland. Macroinvertebrate abundance was positively related to sample weight (F₁,₁₄ = 12.08, p = 0.0037), and sample weight explained approximately 46% (r² = 0.46) of the variation in total macroinvertebrate abundance in CTA samples. Wet mass of samples collected using the TSA protocol ranged from 0.6 to 1.8 kg and required storage in 1-2 sample jars. Total macroinvertebrate abundance was unrelated to sample weight (r² = 0.06, F₁,₁₄ = 0.93, p = 0.35) for TSA samples.
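The regression of abundance on sample weight can be illustrated with a short ordinary least squares sketch in Python. It is a generic simple linear regression, not the authors' software output, and the example data are hypothetical. The helper `f_from_r2` uses the standard identity for a one-predictor regression, F = r²(n - 2)/(1 - r²) with (1, n - 2) degrees of freedom, which for the 16 wetlands gives the (1, 14) degrees of freedom reported above.

```python
def linear_fit(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def f_from_r2(r_squared, n):
    """F statistic for a one-predictor regression, with (1, n - 2) df."""
    return r_squared * (n - 2) / (1.0 - r_squared)


# Hypothetical sample weights (kg) and total counts, for illustration only.
weights = [1.0, 2.0, 3.0, 4.0]
totals = [2.0, 4.0, 5.0, 8.0]
slope, intercept, r_squared = linear_fit(weights, totals)

# With the rounded r² = 0.46 and n = 16, F is about 11.9, close to the
# reported 12.08 (the small gap comes from rounding r² to two decimals).
f_cta = f_from_r2(0.46, 16)
```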
Macroinvertebrate communities were significantly different between depth groups for all three response variables for CTA samples (ANOSIM: R > 0.19, p < 0.026) and for one variable (relative abundance) for TSA samples (ANOSIM: R = 0.294, p = 0.006).
For CTA, multivariate dispersion was significantly different among depth groups for both CPUE (PERMDISP: F 2,13 = 8.42, p = 0.026) and relative abundance (PERMDISP: F 2,13 = 4.54, p = 0.046) but was not different for abundance (PERMDISP: F 2,13 = 4.13, p = 0.16). For TSA, multivariate dispersion was not different among the groups for any of the three response variables (PERMDISP: F 2,13 \ 1.29, p [ 0.49 for all three response variables). Patterns observed for the bootstrapped macroinvertebrate communities based on depth grouping were different between protocols, with a larger overlap between groups generally seen in TSA versus CTA samples; however, depth groups generally showed some separation for all response variables for both protocols groups (Fig. 3).

Time and budgetary differences
Samples collected with the CTA protocol were significantly heavier, required significantly more sample jars, and took significantly longer to collect than did TSA samples (Table 3). In the laboratory, CTA samples were also more expensive to process and took significantly longer to process than did TSA samples; however, total identification time did not differ between protocols (Table 3). Additionally, because the CTA protocol collected more sample volume than the TSA protocol, it required relatively more time to prepare samples (e.g., label and secure jars for shipment), preserve samples, and pack samples for shipment, although the specific time required for these activities was not recorded.

Discussion
Although there were significant differences in macroinvertebrate communities between protocols for abundance, CPUE, and relative abundance, the effect sizes were relatively small; an A-value (signifying effect size for the MRBP) > 0.3 is considered relatively high (McCune and Grace 2002) whereas our highest was A = 0.14. Indeed, there was a large degree of overlap between NMS polygons enclosing the communities for each protocol across all three response variables, and NMS plots were significantly correlated between protocols for all three variables. Additionally, LFTL richness, Shannon-Wiener H', and Simpson's Index were not different between protocols. Other studies have also found minimal differences in macroinvertebrate communities when comparing sampling protocols within a lake (Garcia-Criado and Trigal 2005), a series of wadeable streams (Brua et al. 2011), or among different laboratory processing procedures (Haase et al. 2004). For example, Garcia-Criado and Trigal (2005) found no difference in richness or relative abundance of macroinvertebrate taxa when using both a Kornijów apparatus and a sweep net in macrophytes in an Iberian pond. Brua et al. (2011) found that kick- and U-nets yielded Bray-Curtis similarities of greater than 75% for benthic macroinvertebrate communities collected from wadeable streams in Canada and no statistical differences in standard bioassessment metrics between methods; however, kick nets did collect more Chironomidae. Of the 122 total family/group-level comparisons in our study, 90 (73.7%) were not significantly different between protocols and 21 of 41 (51.2%) families/groups tested were not significantly different between protocols for any of the three variables tested. Only 3 of 41 families/groups (7.3%) had significant differences for all response variables tested: Hydrachnidae, Limnephilidae, and Total Count.
Thus, despite occasional differences, the large majority of families showed no significant difference between protocols for at least one response variable, and communities were significantly correlated between protocols for all three response variables.
The CTA and TSA protocols differed in three primary areas: total weight of samples, where samples are collected in a wetland, and depths of sampling locations. Variability in total macroinvertebrate abundance of the CTA protocol was partially driven by the total wet weight of each bulk sample (each CTA sample weighed approximately 1.4-13.6 times more than its TSA pair). This difference in sample weight between wetlands, determined primarily by the volume of collected vegetation, has the potential to be a source of uncontrolled variation among wetland samples, as the CTA protocol does not currently have a process to standardize for the large variation in material collected from each wetland. Indeed, despite using the same subsampling procedure for both protocols, the sample weight was significantly related to invertebrate abundance for the CTA but not the TSA. As well, a large volume of debris collected in the sweep net can retain organisms smaller than the mesh size and increase the variability of organisms collected in this smaller size fraction (e.g., Carter and Resh 2001).
Sampling multiple locations within wetlands is advocated as a means to obtain a comprehensive inventory of invertebrate diversity (e.g., Halse et al. 2002; Gleason et al. 2018) because the composition of invertebrates can differ significantly between microhabitats in a wetland such as the benthic substrates and macrophyte beds (e.g., Turner and Trexler 1997; Meyer et al. 2013). Indeed, Meyer et al. (2013) found higher estimates of taxa richness but lower estimates of invertebrate (both micro- and non-microcrustacean) density and biomass in D-net samples from wetlands when compared to samples that combined benthic cores, vegetation clipping, and water column sampling. Other studies have found similar differences between habitat-specific sampling devices but have advocated sweep netting of macrophytes as the best method to discriminate invertebrate communities between wetlands because of the higher overall richness of taxa collected (Cheal et al. 1993). However, even within macrophyte beds, invertebrate abundance can vary from nearshore to offshore locations, with overall abundance and/or diversity often highest closest to shore (Cardinale et al. 1998; de Szalay and Resh 2000; Sychra et al. 2010). In a previous study, for example, gastropods were also strongly associated with nearshore areas (Sychra et al. 2010), and in Alberta prairie pothole lakes, limnephilid caddisflies were also most associated with emergent, vegetated habitat (Gleason et al. 2018). Although Gleason et al. (2018) did not find significant differences in the abundance of invertebrates in emergent and open water zones of Alberta wetlands, they did find significant differences in richness, evenness, and taxonomic composition between open water portions of macrophyte beds relative to samples taken from the macrophytes themselves, with total richness higher in macrophytes (see also Beckett et al. 1992).
Thus, TSA's focus on sampling only nearshore areas of macrophyte beds may be responsible for the higher overall CPUE of macroinvertebrates in TSA samples and the higher relative abundance of some taxa strongly associated with macrophytes such as Limnephilidae, Physidae, and Planorbidae. Unlike some other studies that compared very different sampling devices (e.g., benthic core versus a D-net [Meyer et al. 2013], benthic core versus plankton net versus tow net [Cheal et al. 1993], and benthic core versus multiple trap types versus multiple net types versus artificial substrates [Turner and Trexler 1997]), our comparison used two similar nets and focused on one core habitat type (macrophyte beds), although the location of the macrophyte beds sampled did differ between protocols. Thus, we saw less apparent "nesting" of communities (e.g., a clear signal of a benthic community nested within a larger wetland community; Cheal et al. 1993) within wetlands than if either protocol had also used additional devices, such as a benthic corer. Indeed, differences between sampling protocols or methodologies appear more likely when different habitat types are sampled with different devices (e.g., Cheal et al. 1993; Meyer et al. 2013) than when similar habitats are assessed and similar devices are used (e.g., Garcia-Criado and Trigal 2005; Brua et al. 2011). Interestingly, both protocols in our study collected the same number of unique genera, which may be due to the relative tradeoffs of intensively sampling the emergent-submergent transition zone (TSA protocol) versus taking less intensive samples from a variety of areas across a wetland (CTA protocol).
A wetland's bathymetry (i.e., the number of shallow (< 1 m) and deep (> 1 m) sampling locations in the grid) also appeared to drive variation in the CTA protocol. Although neither protocol samples benthic sediments explicitly, both do recommend positioning the net directly above the sediments (for the CTA, only when sites are < 1 m); thus, the action of kicking or sweeping does disturb and collect some benthic organisms for all TSA samples and for some CTA samples (those from locations < 1 m deep). The composition of organisms in benthic substrates can differ from those in the water column and vegetation (e.g., Turner and Trexler 1997), and macroinvertebrate communities can vary with depth of sampling location within macrophyte beds (Sychra et al. 2010). We found that wetland depth (based on the number of shallow vs deep sampling locations) had a significant effect on macroinvertebrate communities for all three response variables for the CTA dataset but only 1 of 3 variables for the TSA dataset; as well, multivariate dispersion was significantly different among groups for two of three response variables for CTA but was not different for any variables for the TSA. However, it is important to note that we did not find an overall difference in multivariate dispersion between the larger CTA and TSA datasets. This variability was only detectable when the wetlands were grouped by depth of sampling locations; thus, the impacts of this variability could itself vary depending on the bathymetries of specific wetlands included in a study. For example, if a study included only wetlands < 1 m deep, one might expect a smaller impact of this difference in sampling depths between protocols.
The two families that were significantly different between protocols (Limnephilidae and Hydrachnidae) for all three response variables do suggest some differences in how the protocols sample macroinvertebrate communities. We suspect that the higher abundance of Limnephilidae (57.2% difference of means between protocols) with the TSA protocol was due to this protocol primarily sampling shallow nearshore areas and the traveling sweep more effectively dislodging the caddisflies from submerged macrophytes compared to the three upwards sweeps with a D-net. Hydrachnidae mites were collected in higher abundance in TSA samples (80% difference of means between protocols), which may reflect the smaller mesh used in this protocol. Smaller mesh nets do collect relatively higher abundances of some smaller-bodied macroinvertebrate taxa such as mites and hydroptilid caddisflies (Rosenberg et al. 1999); however, other mite families were not collected in higher abundance from the TSA samples. Debris buildup in nets can minimize potential effects of different mesh sizes, and we suspect that macrophytes accumulating in sample nets reduced potential impacts of the 100 µm difference between meshes (Carter and Resh 2001). It is also important to note that Naididae and Hydridae were collected at higher raw and relative abundances in the CTA protocol (range from 76 to 178% difference of means). Formalin tends to preserve soft-bodied organisms better than ethanol and may be partially responsible for the higher relative abundances of these soft-bodied taxa (e.g., Souza and Barros 2017) in CTA samples.
The additional time and cost for the CTA protocol are related primarily to the additional sweep locations (10 for CTA vs. 1 for TSA) and larger volume of sample collected by the protocol. Each CTA sample cost approximately 1.9 times as much as each TSA sample ($850 vs. $450, respectively) for analyses and required approximately 50% more time for processing and identification relative to TSA samples (15.6 h vs. 10.0 h, respectively). This difference was due almost entirely to the additional time required to subsample each CTA sample. These differences in time and cost could be substantial depending on the number of samples in a monitoring program. For a study such as ours with 32 samples, the differences in protocols would be approximately $12,800 for processing cost and approximately 179.2 h of processing time. For a larger program with 120 samples, this difference would be approximately $48,000 extra for processing costs and require approximately 672 h of additional processing time relative to the TSA protocol. This is significant, as cost savings in one aspect of a program (e.g., data collection) could be applied to another aspect such as data analysis and reporting (Caughlan and Oakley 2001). The additional time, sample weight, and jar count for CTA samples also warrant careful consideration for field work logistics and associated costs. For example, the additional time, weight, and space required for CTA, compounded with helicopter costs, could reduce the number of sites that could be sampled in a day or in total for a program.
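The extrapolation above is simple per-sample arithmetic, sketched below in Python so it can be repeated for other program sizes. The per-sample costs and hours are the approximate values quoted in the text. The function and variable names are hypothetical, and the sketch assumes that the per-sample differences scale linearly with the number of samples.

```python
# Approximate per-sample laboratory figures quoted in the text.
COST_PER_SAMPLE = {"CTA": 850.0, "TSA": 450.0}   # dollars per sample
HOURS_PER_SAMPLE = {"CTA": 15.6, "TSA": 10.0}    # processing + identification


def extra_cta_burden(n_samples):
    """Return (extra dollars, extra hours) of CTA relative to TSA.

    Assumes the per-sample difference applies equally to every sample.
    """
    extra_cost = (COST_PER_SAMPLE["CTA"] - COST_PER_SAMPLE["TSA"]) * n_samples
    extra_hours = (HOURS_PER_SAMPLE["CTA"] - HOURS_PER_SAMPLE["TSA"]) * n_samples
    # Round to suppress floating-point noise in the subtraction.
    return round(extra_cost, 2), round(extra_hours, 1)


for n in (32, 120):
    cost, hours = extra_cta_burden(n)
    print(f"{n} samples: ${cost:,.0f} extra, {hours:.1f} h extra")
```

Under these assumptions the sketch reproduces the figures given above for 32 and 120 samples; field costs such as helicopter time are not included.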
We documented some differences in macroinvertebrate families and communities between the CTA and TSA protocols. For example, the CTA protocol generally collected a larger raw abundance of organisms, and the TSA protocol generally had a higher catch-per-unit-effort. However, the large majority of our comparisons of three response variables showed no significant differences between protocols at the family level (abundance: 76%; CPUE: 63%; relative abundance: 82%) and results from both protocols were significantly correlated at the community level for all three response variables. Because the overall differences between macroinvertebrate communities were small, both protocols provided a very similar picture of the wetland macroinvertebrate communities. Our study provided useful insights on the comparability of two sample protocols, as comparing protocols with a pilot project like the one completed here is a vital, and often overlooked, step when designing a monitoring program. Indeed, optimizing a sampling design, choosing response variables, and ensuring that sampling protocols can collect data to address program objectives are essential for the scientific defensibility and long-term success of a monitoring program (Caughlan and Oakley 2001). As well, it may be possible to pool or combine results from two (or more) different protocols when the macroinvertebrate communities they characterize are very similar (e.g., Brua et al. 2011). Ultimately, the choice of a sampling protocol should be informed by the objectives of a research or monitoring program (e.g., Meyer et al. 2013). In our comparison, CTA and TSA protocols collected relatively similar macroinvertebrate communities from the wetlands sampled, but the TSA protocol represents a significant savings in time and resources and may result in samples with lower variability.