Introduction

The negative effects of invasive species have touched nearly every corner of the globe. Newly introduced species may alter every trophic level of the native community by initiating a trophic cascade through both direct (e.g., predation, competition) and indirect (e.g., habitat alteration) means (Moyle and Light 1996; Crooks 2002; Strayer 2010; Gallardo et al. 2016). Following a review of the species protected under the U.S. Endangered Species Act, Dueñas et al. (2018) found 73% of those with available information had been negatively impacted by invasive species. Additionally, a global meta-analysis of aquatic invasive species (AIS) by Gallardo et al. (2016) revealed significant decreases in the abundance of established macrophytes, zooplankton, and fish following AIS introduction and establishment regardless of the invading species’ trophic position. The annual economic cost attributable to invasive species in the United States averaged approximately $22.4 billion between 1970 and 2020 (Diagne et al. 2021; Leroy et al. 2021). In the European Union, damages caused by invasive species and efforts to control them cost approximately €20 billion per year (Kettunen et al. 2009; Tollington et al. 2017). According to the United States Aquatic Nuisance Species Task Force, the annual expenditures by its 13 federal member agencies to control AIS in fiscal years 2012 through 2014 averaged approximately $260 million (GAO 2015). The potential for further economic and environmental harm necessitates efforts to prevent the introduction and establishment of future invasive species.

Legislation has largely been successful in limiting invasive species introductions in the United States in the last several decades (Sturtevant et al. 2019). Although legislation may reduce risk, it cannot eliminate new species introductions altogether (Hulme 2006; Epanchin-Niell et al. 2012; Havel et al. 2015). It is vital that novel AIS be detected while they are still rare and geographically restricted, thus maximizing the likelihood that control or eradication efforts will be successful with minimal costs (Myers et al. 2000; Hulme 2006; Mehta et al. 2007; Trebitz et al. 2009; Hoffman et al. 2016; Lodge et al. 2016). In the Laurentian Great Lakes, the 2012 renewal of the Great Lakes Water Quality Agreement prioritized preventing the establishment and spread of novel AIS, leading to the establishment of the U.S. Fish and Wildlife Service (USFWS) Great Lakes AIS Early Detection and Monitoring (EDM) Program.

Efforts to detect a novel non-native species early in its invasion fall into two categories: species-specific sampling and broad-spectrum monitoring. Surveys designed for species-specific sampling are characterized by protocols that maximize the detection probability of a small number of species (Hauser and McCarthy 2009). Conversely, broad-spectrum monitoring is characterized by sampling designed to detect introduced species with a wide range of potential life history strategies or habitat preferences (Trebitz et al. 2017). This type of monitoring is best utilized in systems with either no single direct invasion threat (e.g., a river pool with a single lock and dam system separating an invasion front from uninvaded habitat) or a direct invasion threat by several species simultaneously. In the Great Lakes, Davidson et al. (2021) identified 40 fish species as high priority for early detection due to their current lack of establishment in the basin, existing pathways for introduction, predicted ability to establish, and predicted detrimental impacts. The habitat preferences of these species span all available habitats in the Great Lakes, making species-specific sampling for all of them at once implausible without substantial resources (Trebitz et al. 2009). As a result, the USFWS EDM Program utilizes a broad-spectrum monitoring approach to AIS early detection in the Great Lakes.

The probability of capturing a new species using broad-spectrum monitoring is improved by sampling across all possible habitat types in a location. Thus, the capture of currently established species of all habitat types and niches would help ensure systems are monitored effectively. The substantial amount of effort required to detect a large proportion (e.g., ≥ 95%; Chao et al. 2009) of the established species pool in a complex environment coupled with finite resources necessitates an adaptive management framework to increase program efficiency (Jones et al. 2015; Hoffman et al. 2016; Trebitz et al. 2017). Hoffman et al. (2016) outlined an adaptive management cycle for an EDM program involving “annual surveillance, evaluation, and improvement”, using annually updated data to optimize site selection and the allocation of sampling gears to increase efficiency. This adaptive management cycle involved both targeting sites where high species richness and rare species had been found in past sampling efforts and optimizing the ratio of gears used. This allowed Hoffman et al. (2016) to detect higher overall species richness within three Great Lakes ports. Furthermore, given that a new non-native species cannot definitively be proven absent, the detection of known rare species may also be leveraged as an indication of the likelihood that a program will detect a new non-native species while it is still in low abundance and/or spatially limited (Hoffman et al. 2016; Trebitz et al. 2017). Trebitz et al. (2009) demonstrated an increase in the proportion of rare species detected in the St. Louis River/Duluth-Superior Harbor system by targeting their sites and gear allocation to exploit the patchiness in the distributions of established rare and non-native fish species they had observed in previous sampling efforts. The increases in overall species richness observed by Hoffman et al. (2016) and in rare and non-native species detection rates observed by Trebitz et al. (2009) likely signify a higher likelihood of detecting a new non-native species should one be introduced. Thus, using existing survey data at a site to inform and implement more targeted sampling efforts can help improve sampling efficiency for broad-spectrum non-native species early detection efforts.

In this study, we outline the changes in site and gear selection protocols that allowed for improved detection efficiency of established fish species in Lake Erie as part of the USFWS Alpena Fish and Wildlife Conservation Office (FWCO) EDM Program. Lake Erie (USA) is one of the largest lakes in the world, with a surface area of over 25,000 km², varied habitat complexity, and a history of species invasions. At its outset, the program utilized a depth-stratified random sampling design to develop baseline species detection rates. This sampling design, hereafter referred to as “random sampling”, was implemented for two to six years at four Lake Erie locations: Detroit River (Michigan/Ontario, 5 years), Maumee Bay (Michigan/Ohio, 6 years), Sandusky Bay (Ohio, 6 years), and Cleveland (Ohio, 2 years). Evaluation of the cumulative results over this time period led to changes in the gear allocation and site selection protocols at these locations with the goal of increasing detection efficiencies. The new sampling design, hereafter referred to as “selective sampling”, was tailored to increase the detection rate of established rare species (Trebitz et al. 2017). The ratio of gears in the selective sampling design shifted toward those that were shown to capture a greater proportion of rare or occasional species [as defined by Silver et al. (2017) and Harris et al. (2018)], and site selection was targeted toward areas where rare or occasional species had previously been detected as well as new habitats that had not previously been sampled. The objective of this study was to determine if a shift in sampling design from “random” to “selective” improved EDM program efficiency. We did this by examining the effects of sampling design on (a) species accumulation curves, (b) the percentage of species detected out of the entire predicted species pool, (c) total species richness, and (d) detection rates of non-native and rare species.

Methods

Study areas

We selected four Lake Erie locations for the present study (Fig. 1): Detroit River (MI/ON), Maumee Bay (MI/OH), Sandusky Bay (OH), and Cleveland (OH). These locations are among the 21 sampled in the Lake Erie basin by the USFWS Alpena FWCO EDM program between 2013 and 2022 (USFWS 2022). Locations were identified for EDM sampling due to their high risk of non-native species introductions (Tucker et al. 2020; USFWS 2022). We selected these four locations for the present study because of their robust sample sizes, the high degree of habitat heterogeneity among them, and the consistency in the sampling approach at each location.

Fig. 1

Lake Erie locations [Detroit River (a), Maumee Bay (b), Sandusky Bay (c), and Cleveland (d)] sampled for the early detection of novel non-native species using a stratified random sampling design (“Random”, 2013–2018) and a selective sampling design (“Selective”, 2021–2022)

The Detroit River includes the last 44 km of the connecting channel between Lake Huron and Lake Erie. In addition to being an active shipping port, the Detroit River is a busy navigational shipping route, connecting the upper Great Lakes (i.e., lakes Huron, Michigan, and Superior) with the lower Great Lakes (i.e., lakes Erie and Ontario) and the Atlantic Ocean. The river also acts as the international border between the United States and Canada, with large population centers in Detroit, MI, and Windsor, ON. Heavy shipping traffic and large population centers greatly increase the risk of non-native species introductions, resulting in this location’s ranking as one of the most at-risk in the Great Lakes watershed (Tucker et al. 2020). Much of the Detroit River consists of hardened shorelines. However, there are several areas of critical fish and wildlife habitat, resulting in the creation of the Detroit River International Wildlife Refuge in 2001. The sample area for the present study encompassed both United States and Canadian waters throughout the length of the river (Fig. 1a).

Maumee Bay is located near Toledo, OH in the western basin of Lake Erie. The Toledo metropolitan area is the fifth largest in Ohio by population size (U.S. Census Bureau 2019) and serves as a major commercial shipping port and recreational boating, fishing, and tourism center. Prior to 2015, the Eagle Marsh Nature Preserve in Allen County, IN served as a potential pathway for expansion of AIS from the Mississippi River basin to the Maumee River during high water events; a berm was constructed in 2015 to sever this connection (Murchy et al. 2017). Habitat within Maumee Bay is variable, with hardened and natural shorelines both well represented and sampled depths ranging from < 1 to 11 m. The present study area of Maumee Bay extends from 9 km upstream of the river mouth to a parallel boundary extending from the Cedar Point National Wildlife Refuge to Grand View, MI (Fig. 1b).

Sandusky Bay is located in the western basin of Lake Erie, near Sandusky and Port Clinton, OH. Although these two cities have comparatively small population sizes, the area attracts anglers from all U.S. states and the Canadian province of Ontario (J. Windau, Ohio Department of Natural Resources, personal communication). Sandusky Bay has a smaller commercial shipping port than Maumee Bay or Detroit River, but this pathway for non-native fish introductions persists, nevertheless. The bay is largely shallow and homogeneous in its offshore habitat; shorelines consist of sea wall, rip rap, and natural shoreline. The study area of Sandusky Bay extends from the east edge of Muddy Creek Bay to the Sandusky Bay mouth near Cedar Point (Fig. 1c).

Cleveland, OH is located at the mouth of the Cuyahoga River in the central basin of Lake Erie. The Cleveland metropolitan area is the largest in Ohio by population size (U.S. Census Bureau 2019) and, like Toledo, serves as a major international commercial shipping port and recreational boating and fishing center. Although natural shoreline can be found upstream in the Cuyahoga River, the habitat within the sample area largely consists of hardened shoreline. The study area of Cleveland extends from Edgewater Park northeast to Cleveland Lakefront Nature Preserve (Fig. 1d).

Sampling strategies

The Alpena FWCO implemented stratified random sampling between 2013 and 2018 in the Detroit River (no sampling in 2017), Maumee Bay, and Sandusky Bay, and between 2017 and 2018 in Cleveland. Random points were assigned within depth strata using the Create Random Points function in ArcGIS 10.1/10.2. A wide variety of gears were utilized during this sampling period; however, for the purpose of the present study, we reduced samples to only those gears that were used at all locations (i.e., boat electrofishing, bottom trawl, and paired and single fyke nets; Table 1). Gear utilization was planned to be equal among the gear types; however, the final ratios of gears used during the random sampling period were heavily influenced by competing priorities of the USFWS Alpena FWCO EDM Program (Table 2). Sampling dates ranged from June 25 to November 5 in the Detroit River, May 22 to October 27 in Maumee Bay, May 23 to October 30 in Sandusky Bay, and May 12 to November 8 in Cleveland (Fig. 2).

Table 1 Field sampling gears, specifications, and deployment methods used in the Lake Erie basin by the Alpena FWCO EDM program during random and selective sampling designs, 2013–2022
Table 2 Sites sampled with boat electrofishing (EF), bottom trawl (BT), and fyke nets (paired and single fyke nets combined; FN) using random and selective sampling designs in Detroit River, Maumee Bay, Sandusky Bay, and Cleveland
Fig. 2

Temporal comparison of sites sampled during random and selective sampling designs in the Detroit River, Maumee Bay, Sandusky Bay, and Cleveland

We implemented selective sampling in 2021–22 in all four locations. We selected sites during the selective sampling period in one of two ways. First, sites where rare and/or occasional species [as defined by Silver et al. (2017)] had been captured in previous surveys were prioritized for resampling. For example, we utilized Alpena FWCO EDM program data collected in 2013–2019 to identify 2021 selective sites, while we based 2022 selective sites on data collected in 2013–2019 and 2021; no sampling was conducted in 2020 due to the Covid-19 pandemic. The second manner in which we selected sites during the selective sampling design period was termed “exploratory”. Exploratory sites were in areas within the location that we had not previously sampled with the intention of expanding habitat coverage. For instance, a log jam would have been selected for exploratory sampling if no other large woody debris piles had been sampled at that location, and it represented a novel habitat type to the survey. We identified these sites based on the professional judgement of the biologist and crew members during sampling. Approximately half of the sites sampled at all locations during the selective sampling design period were resampled sites selected based on previous data, while the other half were exploratory sites. Additionally, during the selective sampling period, the gears used and the ratios of sites sampled by each gear (i.e., the gear allocation) at each location shifted from those utilized during the random sampling period toward the gears that had detected the largest proportion of rare species (e.g., in Maumee Bay, boat electrofishing effort was increased, and bottom trawl effort was decreased; Table 2). The habitat characteristics of exploratory sites were used to guide the gear type deployed there (e.g., shorelines ≤ 2 m depth were sampled with boat electrofishing). 
Sampling dates each year ranged from March 31 to October 27 in Detroit River, April 11 to November 10 in Maumee Bay, May 5 to November 9 in Sandusky Bay, and May 10 to November 3 in Cleveland (Fig. 2). Sites sampled in 2019 were not included in the present study because the sampling design was dissimilar from both the random and selective sampling designs presented here.
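The gear comparison that motivated the shift in gear allocation can be illustrated with a short sketch. This is not the program's actual workflow: the function name, the record layout, and the example records are hypothetical, and the sketch only ranks gears by the proportion of previously sampled sites at which at least one rare species was detected.

```python
from collections import defaultdict

def rare_detection_rate_by_gear(site_records):
    """Proportion of sampled sites per gear with at least one rare species.

    site_records: iterable of (gear, detected_rare) pairs, one per sampled site.
    """
    sampled = defaultdict(int)
    with_rare = defaultdict(int)
    for gear, detected_rare in site_records:
        sampled[gear] += 1
        if detected_rare:
            with_rare[gear] += 1
    return {gear: with_rare[gear] / sampled[gear] for gear in sampled}

# Hypothetical records: EF = boat electrofishing, BT = bottom trawl, FN = fyke net.
records = [
    ("EF", True), ("EF", False), ("EF", True), ("EF", False),
    ("BT", False), ("BT", False), ("BT", False), ("BT", True),
    ("FN", False), ("FN", True), ("FN", False),
]
rates = rare_detection_rate_by_gear(records)
# Gears ordered from highest to lowest rare-species detection rate.
ranked = sorted(rates, key=rates.get, reverse=True)
```

A ranking of this kind only indicates which gears might receive more effort; how far the ratio is shifted remains a judgment call that also depends on habitat and logistics.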

Data analyses

We compared the random and selective sampling strategies three ways at each location. First, we calculated sample-based rarefied and extrapolated species accumulation curves with 50 bootstrap replications and species richness asymptotes with 95% confidence intervals for each location using the iNEXT package (Chao et al. 2014; Hsieh et al. 2016) in R version 4.2.1 (R Core Team 2022), and we compared species accumulation trends and asymptotes of the two strategies. We evaluated the number of samples required to reach 50% (t50) and 95% (t95) of the random and selective sampling curves to describe the initial species accumulation (50%) and the effort required to reach a high proportion of the curve asymptote (95%; Chao et al. 2009). These benchmarks were chosen because 50% of the curve asymptote is the minimum estimated sample coverage suggested by Chao and Jost (2012) and Hoffman et al. (2016), while 95% of the curve asymptote indicates the detection of exceedingly rare species (Venette et al. 2002; Chao et al. 2009; Hoffman et al. 2011, 2016). Results for each sampling design at each location were pooled across all years and gear types. This analysis was performed as a qualitative evaluation of the ability of each sampling design to detect a large number of species and their efficiency in doing so.
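As a rough illustration of the quantities involved (not the iNEXT workflow used in this study), the sketch below computes a sample-based rarefaction curve from incidence data, an asymptotic richness estimate, and the effort needed to reach a given fraction of that asymptote. It assumes the classical analytical rarefaction formula and the bias-corrected Chao2 estimator; the function names and example data are hypothetical, and extrapolation beyond the observed number of sites and bootstrap confidence intervals are not implemented.

```python
from math import comb

def incidence_frequencies(sites):
    """Count the number of sites at which each species was detected."""
    freq = {}
    for species_at_site in sites:
        for sp in species_at_site:
            freq[sp] = freq.get(sp, 0) + 1
    return freq

def rarefied_richness(freq, T, t):
    """Expected richness in t sites drawn without replacement from T sites."""
    return sum(1 - comb(T - y, t) / comb(T, t) for y in freq.values())

def chao2(freq, T):
    """Bias-corrected Chao2 estimate of asymptotic richness (incidence data)."""
    s_obs = len(freq)
    q1 = sum(1 for y in freq.values() if y == 1)  # species found at one site
    q2 = sum(1 for y in freq.values() if y == 2)  # species found at two sites
    k = (T - 1) / T
    if q2 > 0:
        return s_obs + k * q1 ** 2 / (2 * q2)
    return s_obs + k * q1 * (q1 - 1) / 2

def effort_to_fraction(freq, T, asymptote, fraction):
    """Smallest number of sites whose rarefied richness reaches the target.

    Returns None when the target is not reached within the T observed sites.
    """
    target = fraction * asymptote
    for t in range(1, T + 1):
        if rarefied_richness(freq, T, t) >= target:
            return t
    return None

# Hypothetical detections at four sites (species coded A to D).
sites = [{"A", "B"}, {"A", "C"}, {"A", "B", "D"}, {"A"}]
T = len(sites)
freq = incidence_frequencies(sites)
asymptote = chao2(freq, T)
t50 = effort_to_fraction(freq, T, asymptote, 0.50)
t95 = effort_to_fraction(freq, T, asymptote, 0.95)
```

In this toy data set t95 is not reached within the observed sites, which is the situation in which an extrapolated curve, such as those produced by iNEXT, is needed.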

Second, we compared the average yearly overall species richness observed during each sampling design in each location to a list of the established contemporary species of the location as a qualitative assessment of survey performance. The contemporary species list consists of established species in the sample area and is used as a complement to the species accumulation curve analyses to assess sampling completeness. To determine the proportion of the fish community that was detected, we divided the observed number of species by the total number of contemporary established species, expressed as a percentage. We formulated contemporary species lists using catch data and professional input provided by USFWS, Michigan Department of Natural Resources, Ohio Department of Natural Resources, Fisheries and Oceans Canada, Ontario Ministry of Natural Resources and Forestry, and Ohio State University. We also conducted an in-depth literature review to develop a list of species to be considered for inclusion (Cudmore-Vokey and Crossman 2000; Hubbs and Lagler 2004; Simon et al. 2016; Rice and Zimmerman 2019). We excluded species from the contemporary species list if they met one or more of four criteria:

  1. Species’ predominant habitat is outside the sampling area (e.g., Orangethroat Darter Etheostoma spectabile).

  2. Species is transient and unlikely to establish in Lake Erie (e.g., Atlantic Salmon Salmo salar).

  3. The established range of the species is outside the sampling area (e.g., Longnose Sucker Catostomus catostomus).

  4. The species has not been detected in Lake Erie or its tributaries since before 2000 (e.g., Pirate Perch Aphredoderus sayanus).
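Once a contemporary species list has been assembled, the percentage detected is simple arithmetic. The sketch below is illustrative only: the species names, years, and detections are hypothetical, and the use of the sample standard deviation across years is an assumption.

```python
from statistics import mean, stdev

def percent_of_csl_detected(detected, csl):
    """Percentage of the contemporary species list (CSL) that was detected."""
    return 100 * len(set(detected) & set(csl)) / len(csl)

# Hypothetical contemporary species list and yearly detections.
csl = {"Walleye", "Yellow Perch", "Round Goby", "Bowfin", "Longnose Gar"}
detections_by_year = {
    2021: {"Walleye", "Round Goby"},
    2022: {"Walleye", "Yellow Perch", "Bowfin"},
}

# Percentage detected in each year, then the mean and spread across years.
yearly = {yr: percent_of_csl_detected(sp, csl) for yr, sp in detections_by_year.items()}
yearly_mean = mean(yearly.values())
yearly_sd = stdev(yearly.values())

# Percentage detected when all years are pooled.
pooled = percent_of_csl_detected(set().union(*detections_by_year.values()), csl)
```

The pooled percentage is never lower than any single-year percentage, because species detected in different years accumulate.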

Lastly, we used species richness and presence/absence metrics to compare the observed species richness, non-native species detections, and rare species detections between the random and selective sampling designs. Although detections of novel non-native species are the primary purpose of AIS early detection programs, only a small fraction of the non-native species established in the four locations sampled can be considered rare and geographically restricted. Additionally, given the ambiguity between a species’ presence but lack of detection and its true absence, detection rates of all known established rare species can be used as a proxy for newly introduced non-native species (Hoffman et al. 2016; Trebitz et al. 2017). Therefore, detection rates of rare and non-native species and overall species richness were utilized to compare the random and selective sampling designs.

We compared site-specific species richness between sampling designs at each location using a generalized linear mixed model with a negative binomial distribution. Fixed effects included sampling design, location, gear, and sampling design-location interactions. The sampling design-location interaction was included to account for possible location-specific differences in sampling design effectiveness. Year was included as a random effect to account for temporal differences while assuming exchangeability across years. We performed pairwise post-hoc analyses to compare the sampling designs within each location separately, with a multivariate t p-value adjustment to correct for multiple comparisons. Due to limited detections of non-native species and rare species, we reduced non-native species and rare species counts to presence/absence of any non-native or rare species at each site instead of richness and analyzed each with a binomial distribution. We categorized rare species as those in the bottom 25th percentile in incidence and abundance in capture data from both sampling strategies combined (Silver et al. 2017). These evaluations served as quantitative comparisons of the two sampling designs, under the assumption that differences between individual units of effort scale up to differences at the location level. Models were run in R version 4.2.1 using the glmmTMB package (Brooks et al. 2017), with post-hoc analyses performed using the emmeans package (Lenth 2023). All graphs were constructed using the ggplot2 package (Wickham 2016). Significance was accepted at p < 0.05.
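The rare species categorization and the reduction to presence/absence can be sketched as follows. This is one possible reading, not the procedure of Silver et al. (2017): it assumes a species is rare when both its incidence (number of sites with a detection) and its total abundance fall at or below the first quartile across species, and the record layout and example catches are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles

def classify_rare(records):
    """Return species at or below the first quartile in incidence and abundance.

    records: iterable of (site, species, count) catch records.
    """
    sites_by_species = defaultdict(set)
    abundance = defaultdict(int)
    for site, species, count in records:
        sites_by_species[species].add(site)
        abundance[species] += count
    incidence = {sp: len(s) for sp, s in sites_by_species.items()}
    # First quartile across species (assumed cut-off for "bottom 25th percentile").
    inc_q1 = quantiles(incidence.values(), n=4, method="inclusive")[0]
    abu_q1 = quantiles(abundance.values(), n=4, method="inclusive")[0]
    return {sp for sp in incidence
            if incidence[sp] <= inc_q1 and abundance[sp] <= abu_q1}

def rare_presence_by_site(records, rare):
    """Reduce catches to presence/absence of any rare species at each site."""
    presence = {}
    for site, species, _count in records:
        presence[site] = presence.get(site, False) or species in rare
    return presence

# Hypothetical catch records (species coded A to E).
records = [
    ("s1", "A", 50), ("s1", "B", 5), ("s1", "C", 1),
    ("s2", "A", 30), ("s2", "B", 2),
    ("s3", "A", 40), ("s3", "D", 10),
    ("s4", "A", 20), ("s4", "B", 3), ("s4", "D", 8), ("s4", "E", 1),
]
rare = classify_rare(records)
presence = rare_presence_by_site(records, rare)
```

The resulting site-level presence/absence values are the kind of response that would then be passed to a binomial model.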

Results

A total of 122 and 114 sites were sampled during the random and selective sample designs, respectively, in Detroit River, 242 and 170 sites in Maumee Bay, 240 and 87 sites in Sandusky Bay, and 69 and 72 sites in Cleveland. A total of 57 and 55 species were detected under the random and selective designs, respectively, in Detroit River, 52 and 55 in Maumee Bay, 41 and 47 in Sandusky Bay, and 39 and 40 in Cleveland (Table 3).

Table 3 Effort (t) and species (n) required to reach 95% and 50% of the species accumulation curve asymptotes in Detroit River, Maumee Bay, Sandusky Bay, and Cleveland under random and selective sampling designs

Species accumulation curves

The species accumulation curves of the selective sampling designs increased more rapidly than those of the random sampling designs in Maumee Bay and Sandusky Bay (t50 equal to 7 vs 14 sites and 7 vs 19 sites, respectively), while the two curves were nearly identical in Detroit River and Cleveland (t50 equal to 9 vs 12 sites and 8 vs 12 sites, respectively; Fig. 3; Table 3). In addition, the asymptotes of the selective and random sampling designs were nearly or completely equal in all locations. It is important to note, however, that 95% confidence intervals of the asymptotes were rather large and exhibited a high degree of overlap at all locations except Maumee Bay.

Fig. 3

Rarefied and extrapolated species accumulation curves, their asymptotes, and 95% confidence intervals of each for random and selective sampling designs in Detroit River, Maumee Bay, Sandusky Bay, and Cleveland. Asymptotes for random and selective sampling protocols in Detroit River are identical. The x-axis, “Number of Sampling Units”, denotes the number of sites (defined for each gear in Table 1; e.g., one electrofishing run, one fyke net-night)

The estimated effort to reach t50 was slightly lower during the selective sampling design in all locations, indicating that although the asymptotes were similar, there may have been a slight increase in the initial slopes of the species accumulation curves (i.e., fewer samples were needed to collect the same number of species with the selective sampling design; Table 3). Furthermore, the t95 of the selective sampling design was lower than that of the random design by 23, 40, and 66% in Cleveland, Maumee Bay, and Sandusky Bay, respectively. The t95 of the selective sampling design was slightly higher than the random design in Detroit River, but only by a comparatively small margin (9%; Table 3).

Contemporary species

The contemporary species lists for each location contained a total of 92, 77, 74, and 66 species in Detroit River, Maumee Bay, Sandusky Bay, and Cleveland, respectively (Appendix Table 5). In the present study, the random sampling design on average captured a lower percentage of the contemporary species list per year than the selective sampling design in Detroit River (41.5 ± 5.0% vs 52.7 ± 1.5% per year, respectively), Maumee Bay (42.0 ± 7.0% vs 63.3 ± 3.6% per year, respectively), Sandusky Bay (38.2 ± 3.5% vs 56.0 ± 5.7% per year, respectively), and Cleveland (49.2 ± 5.4% vs 51.5 ± 2.1% per year, respectively; Fig. 4).

Fig. 4

Mean annual percentage of the contemporary species list (CSL) captured in Detroit River (DTR), Maumee Bay (MAU), Sandusky Bay (SAN), and Cleveland (CLE) under a random and selective sampling design. Error bars denote the standard deviation

Richness and presence/absence metrics

The average overall species richness detected at each site sampled under the selective sampling design was approximately 130% of that detected under the random sampling design in Detroit River (10.8 ± 3.6 vs 8.3 ± 4.5, respectively; p < 0.0045), 160% in Maumee Bay (10.9 ± 4.1 vs 6.9 ± 4.1, respectively; p < 0.001), 150% in Sandusky Bay (9.7 ± 4.1 vs 6.5 ± 3.3, respectively; p = 0.0053), and 140% in Cleveland (7.9 ± 3.1 vs 5.5 ± 3.2, respectively; p = 0.0259; Fig. 5). The selective sampling design was also associated with a larger proportion of sites where non-native species were detected in the Detroit River (0.77 ± 0.45 vs 0.72 ± 0.42, respectively; p = 0.0232) and Maumee Bay (0.91 ± 0.28 vs 0.76 ± 0.43, respectively; p < 0.001); the non-native species detection rates were not significantly different in Sandusky Bay (0.93 ± 0.25 vs 0.85 ± 0.36, respectively; p = 0.1072) or Cleveland (0.71 ± 0.46 vs 0.78 ± 0.42, respectively; p = 0.6076). Lastly, rare species were detected at a larger proportion of sites in the selective sampling design than the random sampling design in Maumee Bay (0.06 ± 0.24 vs 0.01 ± 0.11, respectively; p = 0.0472). The detection rate of rare species was not significantly different between the two sampling designs at the remaining three locations: Detroit River, 0.10 ± 0.30 vs 0.07 ± 0.25, respectively (p = 0.7015), Sandusky Bay, 0.03 ± 0.18 vs 0.01 ± 0.09, respectively (p = 0.1350), and Cleveland, 0.04 ± 0.20 vs 0.03 ± 0.17, respectively (p = 0.8049; Table 4).

Fig. 5

Mean overall species richness per site and proportion of sites detecting non-native and rare species during random and selective sampling in Detroit River (DTR), Maumee Bay (MAU), Sandusky Bay (SAN), and Cleveland (CLE). Error bars indicate standard deviation; asterisks indicate a significant difference (p < 0.05)

Table 4 Estimated difference between selective and random sampling regimes for overall species richness, non-native species presence/absence, and rare species presence/absence at the four survey locations

Discussion

In the present study, the use of a sampling design biased toward maximizing detections of rare and occasional species resulted in increased efficiency and comparable or better species detection at all locations analyzed. The degree to which sampling efficiency and completeness are improved using a selective approach versus a random survey design is influenced by differences in the habitat characteristics and system dynamics of locations at high risk for AIS introduction (Hoffman et al. 2016). Here we discuss underlying reasons for the variation in survey improvements across the locations we sampled, and the implications thereof, including the importance of identifying microhabitats for exploratory sampling and increasing sample effort at sites that previously produced a high incidence of rare species. We also highlight how the use of a contemporary species list provides insights into sampling completeness and efficiencies gained from a selective design. In general, our results provide further evidence that selective designs, when built upon a foundation of randomized data as part of an adaptive management approach, can improve sampling efficiency and completeness across sites that vary in size, habitat heterogeneity, and dominant invasion pathways.

The slopes of the species accumulation curves suggested that the selective sampling design consistently increased the sampling efficiency. The t50 was lower for the selective sampling design than the random sampling design at all locations, indicating fewer sites were necessary to detect 50% of the asymptotic species richness. Additionally, the t95 was markedly lower for the selective sampling design in three of the four locations, and only slightly higher in the fourth location. The asymptotes themselves were nearly identical in all four locations, suggesting that even with the reduced spatial coverage inherent in the selective design, it did not have a reduced capacity to detect a wide variety of species. However, comparing asymptotic values of the two sampling designs in the Detroit River, Sandusky Bay, and Cleveland is somewhat dubious due to wide and overlapping 95% confidence intervals. Nevertheless, these results are similar to those of Hoffman et al. (2011) and Hoffman et al. (2016), who demonstrated an increase in both species accumulation curve slopes and asymptotes in the Port of Duluth-Superior following site and gear selection optimization, respectively. Harris et al. (2018) illustrated similar increases in lower Green Bay, Lake Michigan over multiple yearly programmatic adjustments in an adaptive management framework. The general improvements in the t50 and t95 values under the selective sampling design while maintaining similar asymptotes in comparison to the random sampling design indicate the selective sampling method is successful for increasing EDM program efficiency.

One limitation of species accumulation curves and asymptotic species richness is the lack of perspective regarding the actual number of species present in the system. For instance, a species accumulation curve may have reached 99% of its asymptote, but if that asymptote is only 25% of the species known to be established in the system, the program is unlikely to catch a novel non-native species under broad-spectrum monitoring. The asymptotes of the species accumulation curves of both sampling designs at all locations were consistently lower than the number of species known to be established there. These asymptotes are simply estimates of the number of established species in the system and rely on the assumption that all established species have some vulnerability to capture. In these instances, the asymptotes of the species accumulation curves estimated the number of species vulnerable to the sampling methods, and not the true richness of the location. Species that are not vulnerable to capture (e.g., deep water species during a nearshore survey) would not contribute to the species richness estimate. This is the advantage of the contemporary species list in evaluating comprehensive survey coverage of all known established species. In the present study, the average percentage of the contemporary species list detected each year was higher under the selective sampling design than under the random sampling design by at least 3 percentage points (Cleveland, from 49 to 52% per year on average) and by as much as 21 percentage points (Maumee Bay, from 42 to 63% per year on average). These percentages are still well below the 95% overall species detection rate recommended by Chao et al. (2009) and Hoffman et al. (2016); however, these contemporary species detection rates describe only the average percent detected in a single year. Contemporary species detection rates will increase over longer temporal sampling frames as fish populations fluctuate and new species are detected. 
For instance, at all locations combined, 63 and 70% of the contemporary species list was detected in survey years 2021 and 2022, respectively, while a total of 72% of the contemporary species list was detected over the two years combined. Additionally, the 95% recommended detection rate has historically been applied to the species accumulation curve asymptote, not the number of known established species in the system, which is considerably higher at all four locations.
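The difference between single-year and multi-year detection rates can be sketched in a few lines of Python. The species codes and yearly detections below are hypothetical placeholders, not the study's data, and `percent_detected` is an illustrative helper name; the point is only that the cumulative rate is computed on the union of yearly detections and therefore can never be lower than the best single year.

```python
def percent_detected(detected, contemporary):
    """Percent of the contemporary species list that appears in the detected set."""
    contemporary = set(contemporary)
    return 100.0 * len(set(detected) & contemporary) / len(contemporary)

# Hypothetical contemporary species list and yearly detections (illustration only).
contemporary_list = {"sp01", "sp02", "sp03", "sp04", "sp05",
                     "sp06", "sp07", "sp08", "sp09", "sp10"}
year_1 = {"sp01", "sp02", "sp03", "sp04", "sp05", "sp06"}
year_2 = {"sp01", "sp02", "sp03", "sp04", "sp05", "sp07", "sp08"}

rate_year_1 = percent_detected(year_1, contemporary_list)             # 60.0
rate_year_2 = percent_detected(year_2, contemporary_list)             # 70.0
rate_combined = percent_detected(year_1 | year_2, contemporary_list)  # 80.0
print(rate_year_1, rate_year_2, rate_combined)
```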

Similar to the improvements in contemporary species list detection rates, overall species richness significantly increased under the selective sampling design in each of the four locations sampled. Improvements in the numbers of species detected indicate new habitats were incorporated during the selective survey and/or previous habitats were sampled more thoroughly. Therefore, with this increase in coverage, a novel non-native species is more likely to be detected, and with less effort, under the selective sampling design than under the random design.

Additionally, the presence of non-native species significantly increased in Detroit River and Maumee Bay. It is possible that Sandusky Bay and Cleveland did not exhibit an increase in non-native species detections under the selective sampling design because they contain relatively little habitat heterogeneity in comparison to Detroit River and Maumee Bay. This lack of habitat heterogeneity may have resulted in a more homogeneous species distribution (Benson and Magnuson 1992), increasing the likelihood of high species detection rates under the random sampling design. Detroit River and Maumee Bay, conversely, have higher habitat heterogeneity (i.e., small and large rivers, soft and hard substrates, natural and hardened shorelines, and large nursery areas) that likely increases patchiness in fish distributions (Benson and Magnuson 1992), which the selective sampling design was tailored to exploit.

The goal of AIS early detection is to capture a novel non-native species while it is still in low abundance and spatially restricted (i.e., rare; Myers et al. 2000; Hulme 2006; Mehta et al. 2007; Trebitz et al. 2009; Hoffman et al. 2016; Lodge et al. 2016). The invasion curve illustrates how a novel non-native species will be in low abundance and spatially restricted immediately following its introduction. While it is not possible to estimate the ability to detect a species that may not even be present, the next best option is to use known established species that have abundance and incidence characteristics similar to those of novel non-native species (i.e., established species that are in low abundance and/or spatially restricted). In the present study, Maumee Bay was the only location in which the selective sampling design resulted in a significant increase in the proportion of sites where rare species were detected. In addition to increased patchiness in fish distributions, high habitat heterogeneity has also been linked to increased prevalence of rare species (Miller et al. 2018). That Maumee Bay was the only one of the four locations in which rare species presence increased under the selective sampling design is likely due, at least in part, to it having the highest degree of habitat heterogeneity. Furthermore, the exceedingly low abundance and restricted spatial distribution of rare species reduce the likelihood of observing a statistically significant difference in detection rates during a broad-spectrum AIS early detection effort.

These metrics (i.e., overall species richness, non-native species presence, and rare species presence) provide further support that this selective sampling design increases the likelihood of detecting a non-native species while still rare and geographically isolated. While the selective sampling design was successful at increasing site-level species richness at all locations, the greatest increases in non-native species presence and rare species presence were in locations with patchier fish distributions. Given that high patchiness in fish distributions can be explained by high habitat heterogeneity, the selective sampling design will likely be the most effective in locations with highly heterogeneous habitats. Harris et al. (2018) demonstrated a similar increase in detections of rare species in lower Green Bay following their yearly adaptive management framework. Additionally, Trebitz et al. (2009) successfully increased observed overall, non-native, and rare species richness in the St. Louis River/Duluth-Superior Harbor by utilizing past sampling data to exploit patterns in habitat and gear associations by these species. The success of adaptive management methods in increasing the efficiency of AIS early detection in these two locations, plus those presented here, provides strong support for their implementation elsewhere.

An important aspect in the selective sampling design is the identification of previously sampled sites that produced rare species in the past and their prioritization for resampling in subsequent field seasons. This method for site selection necessarily reduces spatial coverage as sampled areas that had not produced rare species in past years are not resampled. However, an equally important site selection method in the selective sampling design was exploratory. This involved utilizing the expertise of the biologists in the field to select sample sites that expanded the habitat coverage of the survey. For instance, a microhabitat within a location (e.g., a log pile, a patch of rare vegetation, etc.) that had not been sampled in the past would not be identified as a resample site under the selective sampling design, but likely would be identified as a unique habitat by the biologist in the field and sampled as an exploratory site. Both site selection methods utilized in the present study were vital for increasing survey efficiency based on the analyses shown here. However, it is important to note that the success of exploratory sites in locating rare species may vary, possibly due to differing levels of biologists’ expertise in exploratory site selection, or unseen or discrete changes in habitat over time. Therefore, in some situations it may be beneficial to implement a hybrid sampling approach between the random and selective designs, such as 75% of sites sampled under the selective sampling design with 25% of sites sampled with a stratified random design. Such an approach would guard against the perception that species were not detected simply for lack of effort in some habitats on the presumption that the surveyors knew where to look for novel AIS.
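One way to picture the hybrid approach is as a simple site-allocation routine. The Python sketch below is a hypothetical illustration, not the procedure used in the study: it assumes a list of selective candidate sites already ranked by priority (e.g., past rare species detections or exploratory picks) and a pool of remaining candidate sites grouped by habitat stratum, and it fills 75% of the effort from the ranked list and the remainder by drawing randomly within strata in rotation. All site identifiers, stratum names, and the function name are invented for the example.

```python
import random

def hybrid_site_plan(n_sites, selective_ranked, random_pool_by_stratum,
                     selective_fraction=0.75, seed=0):
    """Split n_sites between top-ranked selective sites and stratified random sites."""
    rng = random.Random(seed)
    n_selective = min(round(n_sites * selective_fraction), len(selective_ranked))
    selective = list(selective_ranked[:n_selective])
    chosen = set(selective)

    # Shuffle each stratum's remaining candidates, skipping sites already chosen.
    pools = {}
    for stratum, candidates in random_pool_by_stratum.items():
        remaining = [s for s in candidates if s not in chosen]
        rng.shuffle(remaining)
        pools[stratum] = remaining

    # Draw one site per stratum in rotation until the random quota is filled.
    n_random = n_sites - n_selective
    random_sites = []
    strata = sorted(pools)
    while len(random_sites) < n_random and any(pools[s] for s in strata):
        for stratum in strata:
            if pools[stratum] and len(random_sites) < n_random:
                random_sites.append(pools[stratum].pop())
    return selective, random_sites

# Hypothetical inputs: 30 ranked selective sites and three habitat strata.
ranked = [f"S{i:02d}" for i in range(1, 31)]
pool = {
    "river_mouth": [f"R{i:02d}" for i in range(1, 11)],
    "hardened_shoreline": [f"H{i:02d}" for i in range(1, 11)],
    "open_bay": [f"B{i:02d}" for i in range(1, 11)],
}
selective_sites, random_sites = hybrid_site_plan(20, ranked, pool)
print(len(selective_sites), len(random_sites))  # 15 5
```

Rotating through strata keeps the random portion spread across habitat types even when it is small, which is the property that lets it act as a check on the selective portion.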

Early detection programs for invasive species are commonly plagued by the large amount of time and high cost associated with collecting enough fish to detect a large proportion of the species present in a system (Hulme 2006; Trebitz et al. 2009; Hoffman et al. 2016). It is therefore imperative to explore avenues for increasing overall program efficiency. In the present study, the results of previous research [i.e., Trebitz et al. (2009), Hoffman et al. (2011), Hoffman et al. (2016), Harris et al. (2018)] were used to guide the design of a selective sampling approach to increase AIS early detection program efficiency. The selective sampling approach in the present study was evaluated in four diverse locations simultaneously. The novel sampling design, tailored to capture rare and occasional species through biased site selection and gear allocation, was successful in increasing program efficiency in all four Lake Erie locations. The wide variability in habitat characteristics among Detroit River, Maumee Bay, Sandusky Bay, and Cleveland provides strong evidence for the likely success of a similar sampling design elsewhere. The results of this study indicate species detection rates can be improved across a wide variety of locations with an optimized gear allocation and a sampling design targeted toward increased detection rates of rare and occasional species. These increased species detection rates increase the confidence that a novel non-native species will be detected soon after introduction, maximizing the likelihood of successful control or eradication. Operationally, we recommend that AIS early detection surveys in large, complex locations consist of a random, spatially balanced sampling design with habitat-specific gears at time zero, to be implemented for a period of three to five years. This sampling period should be followed by an adaptive survey approach consisting of a selective sampling design that targets both sites that previously produced rare species and unsampled microhabitats in subsequent sampling periods. Additionally, technological advancements such as eDNA metabarcoding (Valentini et al. 2016; Lacoursière-Roussel et al. 2018; Yao et al. 2022), habitat and species distribution modeling (Cook et al. 2019; Coulter et al. 2022), and advanced geographic information systems (Guisan and Thuiller 2005; Gallo and Waitt 2011) will increasingly provide further opportunities for enhanced program efficiency for broad-spectrum AIS early detection programs worldwide.