Introduction

Concern regarding the degradation of wetlands and loss of wetland services has led to the creation of policies aimed at conserving wetlands. A common impediment to wetland policy implementation is the difficulty in evaluating wetlands at the level necessary to inform land use planning. Intensive approaches to wetland assessment that require site visits, such as indices of biotic integrity (IBIs) or rapid assessment methods, are well established and broadly adopted (e.g., Barbour and Yoder 2000). Unfortunately, the cost and time requirements associated with site visits prohibit the application of intensive methods across broad land use management areas (Brooks et al. 2004).

To facilitate land use planning and to provide habitat managers with greater flexibility in wetland assessments, past research has developed GIS-based assessment tools (e.g., Phillips et al. 2005; Mita et al. 2007; Reiss and Brown 2007). GIS-based tools enable the evaluation of wetland condition using airborne or satellite remotely sensed data, eliminating the need for expensive and time-consuming site visits. However, these tools are predicated on the assumption that landscape composition and configuration are predictive of biotic integrity at individual wetlands.

Unfortunately, the relationship between surrounding landscape and wetland condition is not always strong (e.g., Tangen et al. 2003) and may vary with a watershed’s hydrological transport capacity (Fraterrigo and Downing 2008). In the Aspen Parkland Ecoregion, for example, most wetlands lack stream inputs and receive relatively little surface run-off (Devito et al. 2005), meaning that the mechanism typically connecting wetlands to uplands is likely less active than in regions with greater surface water run-off and stream inputs. Without strong predictive relationships, GIS-based assessments may provide misleading evaluations of wetland condition. The lack of strong predictive relationships could arise because GIS data are incorrect (e.g., out of date); because of time lags between disturbance in the surrounding landscape and conditions within the wetland (e.g., Findlay and Bourdages 2000); or because natural variability in ecological and hydrological functions masks the response of wetland biota to disturbances. Furthermore, spatial autocorrelation among land cover types may confound any observed relationships between individual land covers and wetland condition, as has been observed in lotic systems (King et al. 2005). Spatial autocorrelation is the property of having a non-random distribution: it is common in nature as environmental variables are frequently clustered or spread over gradients (Legendre 1993). If the distribution of land covers is non-random, an apparent relationship between land cover and wetland condition could be the result of some unmeasured causal factor that determines the distribution of land cover. Although spatial autocorrelation likely presents a problem for any correlation-based study relating ecological condition to surrounding land cover, it is rarely measured (King et al. 2005).

Even if a relationship between surrounding landscape and wetland condition is strong, it is likely to be influenced by the spatial extent (sensu Turner et al. 1989) at which landscape characteristics are considered (Rooney and Bayley 2011). Different taxa interact with their habitat at different spatial extents or functional grain-sizes (Romero et al. 2009). Thus, the extent at which landscape characteristics will be most predictive of biotic integrity will depend on the biotic assemblage used to measure integrity (Levin 1992; Paltto et al. 2006). For example, mobile birds might be expected to interact with, and thus be influenced by, a larger area of land surrounding a wetland than stationary plants, yet both are commonly used as the basis of IBI development. Issues of landscape extent have long been acknowledged in the field of landscape ecology (e.g., Turner et al. 1989; Wu 2004; Buyantuyev and Wu 2007), yet rarely do studies that propose methods of assessing wetlands using remotely sensed data articulate the issue of optimizing landscape extent for the taxon or assemblage of interest. The need to optimize landscape extent for the taxon of interest has been acknowledged in work on lacustrine wetlands (e.g., Brazner et al. 2007), but, to the best of our knowledge, not in the shallow open-water wetlands characteristic of much of central North America. Thus, our primary goal was to articulate the importance of optimizing landscape extent to the successful use of remotely sensed data in regional wetland assessments.

To develop a valid GIS-based wetland assessment tool, it must first be demonstrated that land cover is predictive of biotic integrity at individual wetlands. Biotic integrity can be represented by a quantitative measure like an IBI score (Karr 1991). Two IBIs have been developed and tested for use in shallow open-water marsh wetlands of Alberta, one reliant on the vegetation community and the other on the wetland-dependent songbird and shorebird community (Wilson and Bayley 2012). Thus, our first objective was to determine whether land cover could predict these IBI scores and, if so, to identify the strongest and most significant model for predicting plant- and bird-based IBIs, respectively. If our hypothesis that land cover is capable of predicting vegetation- and bird-based IBI scores is supported, the next issue to address is whether those relationships vary with landscape extent. Our second objective, therefore, was to identify the optimal spatial extent at which land cover predicts IBI scores and to ascertain whether this optimal extent is the same for IBIs based on both vegetation and wetland-dependent songbirds and shorebirds. We hypothesize that, given the differences in mobility between these two taxa, bird-based IBI scores will be best predicted by land cover data extracted from larger landscapes.

We were aware that spatial autocorrelation among land covers might bias our conclusions regarding our first two objectives (King et al. 2005). If the distribution of land covers is non-random, but rather depends on the location of the site in question, we might wrongly conclude that land cover is predictive of biotic integrity when, in fact, it is location or some other underlying characteristic of the environment (i.e., spatial dependency) that is influencing biotic integrity. Thus, we sought to confirm that our conclusions regarding the capacity of land cover data to predict wetland IBI scores and the optimal spatial extent at which such predictions are made were not merely the consequence of the non-random distribution of land covers.

Within the jurisdiction where we work, the percent of undisturbed land within 100 m of the open-water boundary has been proposed as a simple estimate of wetland condition that can be measured remotely. Similar measures have been successfully used as proxies of detailed land cover elsewhere (e.g., Miller et al. 1997; Brooks et al. 2004; Wardrop et al. 2007; Sundell-Turner and Rodewald 2008). Although simpler to obtain and interpret, such proxies may exclude important information pertinent to biotic integrity, introducing additional error into GIS-based assessments. For example, not all forms of disturbance can be expected to affect biotic integrity equally: urban or industrial development would likely have a stronger influence on the biotic condition of a nearby wetland than low-intensity agriculture (Forrest 2010; Rooney and Bayley 2012). Thus, our third objective was to contrast a model that predicts IBI scores from the percent of undisturbed land surrounding each wetland with more sophisticated models, derived from detailed land cover data, divided into 11 distinct land cover classes.

Methods

Study area

The 45 wetlands selected for sampling are situated in the Beaverhills watershed of the Aspen Parkland Ecoregion of Alberta, Canada (53.54°N latitude and 113.50°W longitude), which drains into the North Saskatchewan River. This Ecoregion incorporates the transition zone between northern prairie and southern boreal habitats. The region is generally flat and drainage is poor. Climate is temperate with a daily mean temperature of 2.4 °C, with a maximum in July (mean high of 22.2 °C) and a minimum in January (mean low of −19.1 °C) (EC 2011). Precipitation averages 482.7 mm annually with 374.8 mm falling as rain (EC 2011), although there is substantial inter-annual variability. Vegetation transitions from closed aspen forest in the northern part of the Beaverhills watershed to grassland with aspen patches in the south. Wetlands typical of the region are isolated with few surface water inlets or outlets and drainage is primarily via groundwater recharge (Holden 1993).

We chose 45 shallow open-water marsh wetlands from a list of candidates within the Beaverhills watershed that were identified from 2007 aerial photography (Fig. 1). The wetlands were selected to represent a range of disturbance. Twenty-five were relatively undisturbed, situated in parks or other protected habitat. Fourteen included agricultural activity within 500 m of their open-water boundaries. The remaining six were constructed wetlands (age > 3 years), built to provide storm water storage for the Edmonton urban area. All wetlands ranged between 1 and 11 ha and included an open water zone.

Fig. 1

Map of the location of our 45 study sites and the Beaverhills watershed. The national and provincial parks indicated include the Elk Island National Park and the Beaverhill-Cooking Lake Recreational Area

IBI

We used two different IBIs to measure biotic integrity in each wetland. The first was based on vegetation community data, the second on wetland-dependent songbird and shorebird community data (hereafter the bird-based IBI). Sampling followed methods outlined in Rooney and Bayley (2012), and occurred during the summers of 2008 and 2009. In brief, vegetation was sampled from six quadrats deployed within the wet meadow zone of each wetland in August, when peak biomass is expected. The percent cover of each species present was recorded with taxonomy following Moss and Packer (1983) and names updated using the Integrated Taxonomic Information System online database (ITIS 2011). Quadrat results were averaged to yield data on a per wetland basis. For wetland-dependent songbirds and shorebirds, sites were visited three times during the breeding season (May–July). Three sites were visited between sunrise and 10:30 a.m. each day, and site order was rotated so that each site was visited once at sunrise, once at the middle period, and once at the latest period of the morning. On each visit, auditory surveys (8 min, 50 m fixed-radius point counts) were carried out at two locations spaced at least 150 m apart. All target bird species detected by sight or sound were recorded. Identifications followed the American Ornithologists’ Union standard (Poole 2005). The two point counts were summed, and the maximum count from the three visits was taken to yield counts on a per wetland basis.
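The aggregation of bird point counts described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `wetland_counts` and the species counts are hypothetical, and it assumes that the maximum across visits is taken separately for each species.

```python
def wetland_counts(visits):
    """Collapse point counts to one count per species per wetland.

    `visits` is a list of visits; each visit is a list of point-count
    stations; each station maps species name -> individuals detected.
    Assumption: the maximum across visits is taken per species.
    """
    per_visit = []
    for stations in visits:
        summed = {}
        for station in stations:
            # Sum the two point counts made during a single visit.
            for species, n in station.items():
                summed[species] = summed.get(species, 0) + n
        per_visit.append(summed)
    species_seen = set().union(*per_visit)
    # Keep the largest summed count observed on any of the visits.
    return {sp: max(v.get(sp, 0) for v in per_visit) for sp in species_seen}


# Hypothetical counts for one wetland: three visits, two stations each.
example_visits = [
    [{"Marsh Wren": 2}, {"Marsh Wren": 1, "Sora": 1}],
    [{"Marsh Wren": 1}, {}],
    [{"Sora": 2}, {"Sora": 1}],
]
example_counts = wetland_counts(example_visits)
```

Taking the maximum rather than the mean avoids penalizing a wetland for visits on which a resident species simply went undetected.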

IBI scores were calculated following Wilson and Bayley (2012). From the vegetation community data, we extracted metrics including the FQI score (Miller and Wardrop 2006; Forrest 2010), the relative cover of native perennials, the relative cover of sedge species, and the width of the wet meadow zone. From the wetland-dependent songbird and shorebird community data, we calculated metrics including the richness of temperate migrants, richness of Passeriformes, and the relative abundance of ground nesters, canopy foragers, and omnivores. We measured the Pearson’s correlation between the two IBI scores to evaluate the level of agreement between them, using SYSTAT software (SYSTAT 2007).
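For reference, the agreement measure used here is the standard Pearson product-moment correlation. The sketch below computes it in plain Python rather than SYSTAT; the function name `pearson_r` is an illustrative choice.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Applied to the vegetation-based and bird-based IBI scores of the same wetlands, a value near +1 would indicate that the two indices rank wetlands similarly.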

Land cover

Satellite imagery was collected on 1 September 2009 and consisted of 2.5 m panchromatic and 10 m multispectral SPOT imagery, which was provided by the Alberta Terrestrial Imaging Centre. We classified the imagery into 16 land cover types using the fuzzy k-means unsupervised classification tool in Geomatica Focus (PCI 2007). Classes were first identified through visual assessment of the SPOT image, Google Earth images, and a 1 m 2009 air photo. Manual editing was performed to reclassify incorrectly classified areas on the map. Significant overlap of land covers was observed in 4 of the 16 classes. Each of these four overlapping classes was masked and an unsupervised classification using six classes was conducted on each. Classes were then classified into land covers using the SPOT image, Google Earth images, and 2009 air photo for validation. The land cover classes were then merged where appropriate (i.e., agriculture with agriculture, etc.) to create a final map with 11 land cover classes with 5 m grid resolution (Supplementary Table S1).

Next, land cover patches were converted to polygons in ArcMap (ESRI 2011) and a series of seven nested buffers were created ranging in radius: 100, 300, 500, 1000, 1500, 2000, and 3000 m. These buffers were generated from the perimeter of the open water of each wetland (i.e., between standing water and emergent vegetation). The area of each land cover class within each buffer was calculated using zonal statistics. Absolute areas were converted to proportional data to standardize among wetlands. Road density was calculated as the sum of the length of all linear road features within each buffer divided by the total buffer area. Road features were taken from the Altalis 1:20,000 base feature road vector layer. The proportion of undisturbed land within each buffer was calculated as the sum of the proportion of grass and forest cover.
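The per-buffer summaries described above reduce to simple arithmetic once the class areas and road lengths have been extracted. The sketch below illustrates that arithmetic only; it does not reproduce the ArcMap workflow, and the function names, class labels, and example values are hypothetical.

```python
def land_cover_proportions(areas_m2):
    """Convert absolute class areas within a buffer to proportions of the buffer."""
    total = sum(areas_m2.values())
    return {cover: area / total for cover, area in areas_m2.items()}


def road_density(road_lengths_m, buffer_area_m2):
    """Total length of road features divided by total buffer area (m per m^2)."""
    return sum(road_lengths_m) / buffer_area_m2


def undisturbed_proportion(proportions):
    """Proportion of undisturbed land: grass plus forest cover."""
    return proportions.get("grass", 0.0) + proportions.get("forest", 0.0)


# Hypothetical areas (m^2) for a single buffer.
example_areas = {"forest": 30000.0, "grass": 10000.0,
                 "agriculture": 50000.0, "urban": 10000.0}
example_props = land_cover_proportions(example_areas)
```

Because the proportions within a buffer sum to one, they are not independent of each other, which is the source of the collinearity among land covers noted elsewhere in the text.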

Spatial autocorrelation

Positive spatial autocorrelation is the property whereby two points located next to each other are more similar than two points located far apart (Legendre 1993) and results from some underlying spatial dependency (sensu Goodchild 1992). We measured spatial autocorrelation in a subset of four of the 11 land cover classes using the Moran’s I index, which typically ranges from −1 to +1 with a zero value indicating a random distribution, positive values indicating that a land cover’s distribution is clustered (i.e., positive spatial autocorrelation), and negative values indicating a tendency to over-disperse (Moran 1950). We were interested in the forest, agricultural, urban, and roads land cover classes because of their importance in models predicting IBI scores (see “Results” section). Beginning with the land cover map described above, we ran a two pass 3 × 3 majority filter to remove noise and single pixel features. We then converted this filtered map into binary maps for each land cover class of interest and calculated Moran’s I index value, Z scores, and p values in ArcMap (ESRI 2011) to test the null hypotheses that each class was randomly distributed across the study area.
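To make the index concrete, the following sketch computes Moran's I for a small binary raster. It assumes rook contiguity (each cell is a neighbour of the cells sharing an edge with it) with unit weights; the weighting scheme used by ArcMap in this study is not stated here, so this is an illustration of the statistic rather than a reproduction of the analysis.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using rook-contiguity unit weights.

    I = (N / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2
    where N is the number of cells, W the sum of weights, and m the mean.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[value - mean for value in row] for row in grid]
    denom = sum(d * d for row in dev for d in row)
    if denom == 0:
        raise ValueError("grid is constant; Moran's I is undefined")
    cross = 0.0
    weight_sum = 0
    for i in range(rows):
        for j in range(cols):
            # Visit the four edge-sharing neighbours of cell (i, j).
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    cross += dev[i][j] * dev[ni][nj]
                    weight_sum += 1
    return (n / weight_sum) * (cross / denom)


# A clustered binary map (e.g., forest = 1 on the left half) and a checkerboard.
clustered = [[1, 1, 0, 0] for _ in range(4)]
checkerboard = [[(i + j) % 2 for j in range(4)] for i in range(4)]
```

The clustered map yields a positive value, whereas the checkerboard, in which every neighbour differs, yields the over-dispersed extreme.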

Modeling

We used backwards stepwise general linear models (GLMs) with maximum likelihood estimation in SYSTAT (SYSTAT 2007) to identify the best model of each of the two IBIs using the relative cover of the 11 land cover classes and road density extracted from each buffer width separately. This modeling approach eliminates issues associated with multi-collinearity among proportional land cover data by automatically accounting for simultaneous contributions from multiple predictors. In order to reduce heteroscedasticity and improve the normality of GLM residuals, land cover data were 2/π × arcsine(square-root(x)) transformed (recommended by McCune and Grace (2002) for percent cover data), whereas road density was log(x + 1) transformed prior to analysis. The model tolerance was 1 × 10⁻¹², the probability threshold for a variable to enter or be removed from the model was 0.1, and each GLM was limited to 20 iterations. We also used regression tree modeling in CART (Steinberg and Colla 1997) to confirm the results of the GLMs, growing the maximal model with least squares regression and pruning with tenfold cross-validation to the minimum cost model. Regression tree models were compared based on their r² values.
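The two transformations named above can be written out as follows. This is a minimal sketch: the function names are illustrative, and the natural logarithm is assumed for the road density transform because the base is not stated.

```python
import math


def arcsine_sqrt(p):
    """2/pi * arcsine(sqrt(p)) for a proportion p in [0, 1]; the result is also in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return (2.0 / math.pi) * math.asin(math.sqrt(p))


def log_plus_one(density):
    """log(x + 1) transform for road density (natural log assumed)."""
    if density < 0.0:
        raise ValueError("density must be non-negative")
    return math.log(density + 1.0)
```

The 2/π scaling keeps transformed proportions on the 0 to 1 range, and adding 1 before taking the logarithm keeps buffers with no roads defined at zero.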

We identified the optimal landscape extent as that which yielded the model with the greatest (1) statistical significance (F value, p value); (2) predictive strength (r² value); and (3) balance between model accuracy and complexity, using Akaike’s information criterion as corrected for small sample sizes (the AICc value), which is a method analogous to the optimal zoning approach described by Jelinski and Wu (1996). Of these criteria, we gave the most weight to the last criterion, as adding predictor variables to a model will generally increase its predictive strength, even if only due to random chance. The buffer size with the lowest AICc value, lowest p value, and largest r² value was considered the optimal landscape extent at which IBI scores should be related to land cover. We then compared the optimal landscape extent for the vegetation-IBI with the optimal extent for the bird-IBI to evaluate whether the two communities were related to land cover at the same spatial extent.
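The AICc criterion can be illustrated with the common least-squares form, AICc = n ln(RSS/n) + 2k + 2k(k + 1)/(n − k − 1). The sketch below assumes that form and assumes that k counts all estimated parameters (including the intercept and residual variance); the exact expression reported by SYSTAT may differ by a constant. The residual sums of squares in the example are hypothetical, not values from this study.

```python
import math


def aicc(rss, n, k):
    """Small-sample corrected AIC for a least-squares model.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters (assumed to include the
    intercept and the residual variance).
    """
    if n - k - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_extent(fits, n):
    """Return the buffer radius whose model has the lowest AICc.

    fits maps buffer radius (m) -> (rss, k).
    """
    return min(fits, key=lambda radius: aicc(fits[radius][0], n, fits[radius][1]))


# Hypothetical fits for three buffer radii and 45 wetlands.
example_fits = {100: (5200.0, 7), 500: (6100.0, 6), 1000: (9000.0, 5)}
example_best = best_extent(example_fits, 45)
```

The correction term grows quickly as k approaches n, which is why AICc is preferred over r² alone when stepwise selection retains different numbers of predictors at each extent.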

To confirm that spatial autocorrelation was not responsible for the observed relationships between land cover and IBI scores, we factored out the influence of location and re-ran the models. We did this by first regressing UTM Easting and Northing coordinates on IBI scores for both IBIs and saving the residuals. We then used the saved residuals as response variables in place of raw IBI scores and repeated the backwards stepwise GLMs described above. We also re-ran the optimal models identified in the original model selection after factoring out the influence of location in order to parse the total variation in IBI scores into components explained by both location and land cover, components explained exclusively by location, and components explained exclusively by land cover.
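The step of factoring out location amounts to fitting an ordinary least-squares regression of IBI score on the two coordinates and keeping the residuals. The sketch below shows one way to do this in plain Python; it is not the SYSTAT procedure, the function name `ols_residuals` is hypothetical, and the predictors are centred only to keep the arithmetic well conditioned for large UTM values (centring does not change the residuals).

```python
def ols_residuals(y, predictors):
    """Residuals of y regressed on the predictor columns plus an intercept.

    predictors is a list of equal-length columns, e.g. [easting, northing].
    Raises ZeroDivisionError if the predictors are perfectly collinear.
    """
    n = len(y)
    # Centre each predictor to improve numerical conditioning.
    centred = []
    for col in predictors:
        mean = sum(col) / n
        centred.append([v - mean for v in col])
    X = [[1.0] + [col[i] for col in centred] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y).
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return [yi - fi for yi, fi in zip(y, fitted)]
```

The returned residuals would then replace the raw IBI scores as the response variable, so that any remaining relationship with land cover cannot be attributed to a linear trend in location.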

We also modeled the proportion of undisturbed land (also 2/π × arcsine(square-root(x)) transformed) as a predictor of IBI scores. Using the criteria outlined above, we compared these simple one-predictor models with the best models identified by our backwards stepwise GLMs.

Results

The vegetation-based IBI scores ranged from 4.4 to 97.4 whereas the bird-based IBI scores ranged from 11.3 to 100.0, indicating that wetlands spanned a gradient from the reference condition to heavily disturbed. The two IBI scores were strongly correlated with each other (Pearson’s r = 0.80, p < 0.00001), revealing good agreement in assessment of wetland condition by the two indices.

IBI scores for both plant- and bird-based IBIs were significantly predicted by land cover data at all spatial extents considered (p < 0.000001), with a minimum of 63 and 60 % of variance in score explained by land cover for plant- and bird-based IBIs, respectively. The best model for each IBI had the highest r² value and the lowest AICc value (Tables 1, 2). Measures of F and p value were less useful in identifying the optimal model, as models differed in the number of predictor variables included, and thus in their number of degrees of freedom. For the plant-based IBI, the best model included the density of roads, and the proportion of the following land covers: agricultural land, open water, urban development, and emergent vegetation. For the bird-based IBI, the best model included road density, and the proportion of forest, rail lines, emergent and wet meadow vegetation. It should also be noted that land cover was better able to predict plant-based IBI scores (82 % variance explained) than bird-based IBI scores (70 % variance explained). The results of regression tree modeling were in general agreement with GLMs, and so detailed regression tree results are not presented.

Table 1 Results of GLM predicting vegetation-based IBI scores for 45 wetlands using land cover data extracted from a series of seven nested spatial extents
Table 2 Results of GLM predicting wetland-dependent songbird-based IBI scores for 45 wetlands using land cover data extracted from a series of seven nested spatial extents

The predictor variables included in the model, the model fit, and the r² value differed depending on the spatial extent at which land cover was considered (Tables 1, 2). For the vegetation-based IBI, the 100 m spatial extent yielded the strongest and most significant predictions of IBI scores. In contrast, for the bird-based IBI, the 500 m buffer provided the best fit (Fig. 2).

Fig. 2

Plot of the percent of variance in IBI score explained using land cover data extracted from a nested series of spatial extents. Models predicting vegetation-based IBI scores are indicated by open circles whereas those predicting bird-based IBI scores are indicated by closed circles. Note that vegetation-based IBI scores are best predicted using land cover data extracted from within 100 m, whereas bird-based IBI scores are best predicted using land cover data extracted within 500 m of the open-water boundary

The forest, agricultural, urban, and roads land cover classes all exhibited significant positive spatial autocorrelation (Table 3), confirming that the distribution of these land cover classes is clumped. Furthermore, the regressions of UTM Easting and Northing coordinates on the plant- and bird-based IBI scores were significant, each explaining about 40 % of the total variance in IBI scores (plant-based IBI = −176.34 + 0.0011 UTM_E − 0.000030 UTM_N, F(2, 42) = 14.02, p = 0.00002; bird-based IBI = −1000.42 + 0.0012 UTM_E + 0.00011 UTM_N, F(2, 42) = 14.29, p = 0.00002). Despite the strong relationship between IBI scores and location, re-running the backwards stepwise GLMs with the residuals from regressing location on IBI scores did not change our conclusions. Even with the influence of location factored out, land cover yielded strong and significant predictions of biotic integrity at every spatial scale assessed. Understandably, r² values were somewhat reduced (Table 4), but optimal models still explained about 50 % of the variance in biotic integrity using land cover data. The variables included by the backwards stepwise selection process were mainly a subset of those included in models predicting raw IBI scores within the same landscape extent (Supplementary Table S2). Partitioning the variance in IBI scores (the total sums of squares) into the components explained by location alone, by location and land cover jointly, and by land cover at the optimal spatial extent for each IBI reveals that location has nearly no independent relationship with IBI scores (Fig. 3). In contrast, nearly half of the sums of squares explained by land cover are explained by land cover independent of location.

Table 3 Results of tests for the random distribution of four important land cover classes in the Beaverhills watershed
Table 4 Comparison of the proportion of variance in biotic integrity explained (i.e., r² values) by land cover when land cover classes are regressed on raw IBI scores versus when they are regressed on IBI scores after factoring out any influence of location
Fig. 3

Partitioning of variance in biotic integrity into components explained by location alone, jointly by location and land cover, and by land cover alone. The residual variance cannot be explained by terms in our models. The numbers represent the sums of squares explained by each term. In the figure depicting variance in the plant-based IBI scores, land cover is extracted from within the 100 m radius buffers, but for the bird-based IBI, land cover was extracted from within the 500 m radius buffers. The figure confirms that the relationship between biotic integrity and land cover is not merely spurious, resulting from the relationship between land cover and location. It also illustrates that very little of the variance in IBI scores can be explained by location independently of land cover

The proportion of undisturbed habitat within 100 m is a relatively poor predictor of vegetation- and bird-based (raw) IBI scores (Table 5). Using the proportion of undisturbed land extracted from larger buffers improves r² values and model fit, with the best fit obtained by using data within 500 and 1,500 m to predict scores for plant- and bird-based IBIs, respectively. Yet even at optimal spatial extents, the one-variable models do not perform as well as models that use detailed land cover data: one-variable models have lower r² values, lower F values, and higher AICc values than models using more detailed land cover data (Tables 1, 2, 5).

Table 5 Results of GLM predicting plant- and bird-based IBI scores using the proportion of undisturbed land within a series of nested buffers

Discussion

Land cover predicts biotic integrity

Our results offer support for the use of land cover as an indicator of biotic integrity estimated by both vegetation and bird communities. The variation in land cover within surrounding landscapes was able to explain the majority of variance in IBI scores (70–82 %), but the proportion of variance explained varied with the spatial extent of the landscape considered. We found that bird-based IBI scores were best predicted by land cover within 500 m wide buffers around each wetland, whereas plant-based IBI scores were best predicted by land cover within 100 m wide buffers around each wetland. Mack (2006) and Mita et al. (2007) found similarly high r² values in predicting vegetation-based IBI scores with measures of land cover surrounding depressional wetlands (72 and 65 %, respectively); however, neither study examined multiple spatial extents.

Based on our results, use of GIS data to complete region-wide assessments without site visits would introduce remarkably little error. The success of our land cover-based models supports previous studies suggesting that bird and vegetation communities are related to land cover variables, even over large spatial extents (e.g., Fairbairn and Dinsmore 2001; Brazner et al. 2007; Luoto et al. 2007). The relationship between biotic integrity and land cover is likely the result of multiple processes. For example, wetlands situated in disturbed landscapes suffer increased exposure to environmental stressors (Crosbie and Chow-Fraser 1999; Houlahan and Findlay 2004), increased nest predation (Phillips et al. 2003), increased invasion by exotic species (Galatowitsch et al. 2000), and reduced habitat connectivity, which has been associated with reduced richness of waterbirds (Guadagnin and Maltchik 2007). It is, therefore, not surprising that land cover surrounding a wetland is predictive of that wetland’s biotic integrity as measured from plants and wetland-dependent songbirds and shorebirds.

Several factors could be responsible for the slight differences between model-estimated and observed IBI scores, including simple environmental variability. In addition, there may be a time lag between when a change occurs on the landscape and when the biota responds noticeably to that change. This is especially problematic where the disturbance is expected to increase local extinction rates or decrease local re-colonization rates, as long lived residents will temporarily mask these effects (Findlay and Bourdages 2000). In such cases, land cover data may warn of impending impacts to wetland biota.

Although the variables included in the models predicting IBI scores varied with spatial extent and with the IBI considered (Tables 1, 2), certain variables emerged as consistently important. For example, road density or the relative cover of roads was an important variable in models predicting bird-based IBI scores for all of the spatial extents considered and for four out of seven of the models predicting vegetation-based IBI scores. Looking at wetland bird communities in agriculturally impacted areas of Minnesota, Whited et al. (2000) found that road density was an important predictor, and that road effects on bird communities were most pronounced at the 500 m spatial extent. Looking at wet meadow vegetation communities in Minnesotan wetlands, Galatowitsch et al. (2000) also identified road density as an important correlate of community composition, and found that the relationship was strongest at the smallest spatial extent that they considered (also 500 m). Thus, our results are in agreement with both of these studies in terms of the importance of roads and the spatial extent at which they are most influential on wetland bird communities.

Practically all models that predict wetland condition based on land cover data include some measure of the amount of disturbed land surrounding the wetland (e.g., Mensing et al. 1998; Mita et al. 2007; Wardrop et al. 2007; Sundell-Turner and Rodewald 2008). Six of our seven models predicting plant-IBI scores and three of our seven predicting bird-IBI scores included agricultural and urban land covers. Those models which did not include agriculture and urban covers all included forest cover. The combination of agricultural and urban land covers constitutes the majority of land disturbed by human activity, whereas railway lines, cutlines, and roads constitute only small fractions of the landscape on an area basis. In contrast, forested land makes up the bulk of undisturbed land cover at the northern edge of the prairies where the Beaverhills watershed is situated. Because of the collinearity in land covers, the proportion of agricultural and urban land covers might be considered the inverse of forest along a gradient of disturbance. Thus, all the models predicting IBI scores include some measure of the amount of land disturbed by human activities, either directly (agriculture + urban) or indirectly (forest).

The only prominent difference in the subset of land cover classes that predicted vegetation-based IBI scores versus those classes that predicted bird-based IBI scores was that bird IBI models more often included railway lines and emergent vegetation. This suggests that these land cover classes may have a greater effect on birds than on plants. Birds are sensitive to noise disturbance, both because it masks predator arrival and alarm calls and because it interferes with songs related to territory defense and mate attraction (e.g., Slabbekoorn and Ripmeester 2008). Certainly railway lines may act as major sources of noise. Similarly, birds rely on emergent vegetation zones for nesting habitat (Delphey and Dinsmore 1993), and thus the area of emergent vegetation available could have an important effect on bird communities. It is unclear to us why the area of emergent vegetation was not predictive of the biotic integrity of wet meadow plants. Possibly the scale of the imagery did not match the scale at which the vegetation was sampled in the field.

Spatial extent influences relationship between land cover and biotic integrity

The appropriate spatial extent at which to evaluate land cover data is contingent on what taxon forms the basis of assessments of biotic integrity. The importance of spatial extent has been noted before (e.g., Turner et al. 1989; Wu 2004; Houlahan et al. 2006; Brazner et al. 2007), yet previous efforts to develop GIS-based assessment tools typically test only a single spatial extent and rarely provide a biological rationale for its selection. In stream or riverine wetland studies, land cover data is often extracted from within the watershed (e.g., Miller et al. 1997; Falcone et al. 2010) or at the scale of ridge tops and valleys (Wardrop et al. 2007), but in small depressional wetlands with complex surface–groundwater interactions, topography is not necessarily the appropriate basis on which to make decisions about landscape extent (Devito et al. 2005). Previous authors acknowledged the absence of an obvious landscape extent appropriate for depressional wetlands (e.g., Brown and Vivas 2005; Reiss and Brown 2007). Typically, authors either select landscape extents arbitrarily or adopt values published in the literature. Studies that we reviewed examined buffers ranging from 100 m to 3 km wide, which is also the range over which we found models capable of predicting IBI scores using land cover data, but the most frequently adopted extent in the studies we reviewed was 1 km. In our study system, this was larger than the optimal spatial extent for either the vegetation or the bird community.

Although all seven spatial extents that we considered yielded statistically significant models, their predictive strength varied. Changing extents is known to influence certain landscape pattern metrics (Turner et al. 1989; Wu 2004), but we believe that variation in the strength of relationships between biotic integrity and landscape composition across spatial extents reflects the real spatial nature of the relationship between biota and their surroundings. In other words, the predictive strength of our models is greatest when the scale of analysis approximates the operational scale of the taxon in question (Wu 2004). Plant-based IBI scores were best predicted at the smallest spatial extent we considered (within 100 m radius buffers), whereas bird-based IBI scores were most strongly and significantly related to land cover within 500 m radius buffers. We attribute this discrepancy to the fact that, unlike stationary, passively dispersing plants, birds are mobile and actively select their habitat. That is, they have a larger functional grain size (sensu Romero et al. 2009). Most of the passerines important to the bird-based IBI have breeding territories less than 1 ha, but many area-sensitive species like Wilson’s Phalarope (Phalaropus tricolor), Black Tern (Chlidonias niger), Marbled Godwit (Limosa fedoa), and American Avocet (Botaurus lentiginosus) appear to be using 100 ha areas (Dechant et al. 2001; Dechant et al. 2002; Zimmerman et al. 2002; Dechant et al. 2003), which, because of the small size of our wetlands, corresponds to a landscape with a buffer radius of about 500 m (the area encompassed by the 500 m buffers ranged from 91.4 to 147.1 ha, with a mean of 112.6 ha). The idea that the optimal landscape extent for assessing biotic integrity might be species-specific is supported by other bird studies. For example, Tozer et al. (2010) found that both Marsh Wren and Least Bittern abundances were related to the proportion of wetland in the surrounding landscape, but at different spatial extents. They suggest that the positive relationship between wetland habitat and bird abundance is due to a greater influx of dispersing individuals in landscapes with a greater proportion of wetland habitat, although such a mechanism would apply equally to plants through the dispersal of their propagules, either by wind, water, or animal vectors. The area of wetland cover within our buffers did not emerge as a generally important predictor of IBI scores, although measures of natural habitat (e.g., forest cover) or its complement (e.g., agricultural and urban land cover) were consistently important in our models and could also influence dispersal. Regardless, the critical influence of spatial extent on the strength and significance of models relating biota to the surrounding landscape is clear. Our findings warn against the arbitrary selection of landscape extents, especially in depressional wetlands where appropriate extents are not dictated by topography.
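As a rough illustration of why a 100 ha area requirement lines up with a buffer of roughly 500 m around a small wetland, the following sketch treats each wetland as an idealised circle and counts the wetland itself as part of the buffered area. Both are simplifying assumptions made only for this example, not a description of how the buffers in this study were delineated, and the function names are hypothetical.

```python
import math

HA_PER_M2 = 1e-4  # 1 ha = 10,000 m^2


def equivalent_radius_m(area_ha):
    """Radius (m) of a circle whose area equals area_ha hectares."""
    return math.sqrt(area_ha / HA_PER_M2 / math.pi)


def buffered_area_ha(wetland_radius_m, buffer_m):
    """Area (ha) of an idealised circular wetland plus a buffer of width buffer_m.

    Assumption: the wetland itself is counted as part of the buffered landscape.
    """
    return math.pi * (wetland_radius_m + buffer_m) ** 2 * HA_PER_M2


# Radius of a circle enclosing a 100 ha area requirement.
print(round(equivalent_radius_m(100.0), 1))
# Radius of a circle matching the mean buffered area reported in the text (112.6 ha).
print(round(equivalent_radius_m(112.6), 1))
# Buffered area for a hypothetical wetland of 100 m radius with a 500 m buffer.
print(round(buffered_area_ha(100.0, 500.0), 1))
```

Under these assumptions, a circle of 100 ha has a radius somewhat larger than 500 m, so a 500 m buffer drawn around the edge of a small wetland encloses an area of about that size.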

Spatial autocorrelation

We were surprised that land cover was able to explain more than half the variance in IBI scores even at a distance of 3 km, especially for non-mobile plants, given that soil storage capacities in the region are so high that run-off is minimal (Devito et al. 2005) and the watershed transport capacity is very low (i.e., the wetlands are fairly isolated). One explanation we considered was that land covers were positively spatially autocorrelated such that, for example, the abundance of forest 3 km away is able to predict the biotic integrity of a wetland because it is predictive of the amount of forest adjacent to the wetland, not because it is directly affecting communities within the wetland across such a large distance. In other words, we were concerned that the observed correlation between land cover at large spatial extents and IBI scores could be spurious. Positive spatial autocorrelation is common in nature (Legendre 1993), and indeed the four land cover classes most commonly included in our models exhibited significantly clumped distributions (Table 3).
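One common way to quantify the kind of clumping described above is Moran's I. The sketch below is a minimal illustration only: it assumes binary neighbour weights within a fixed distance, and it is not necessarily the statistic or weighting scheme behind Table 3. The function name and the toy transect are hypothetical.

```python
import numpy as np


def morans_i(values, coords, max_dist):
    """Moran's I with binary weights: w_ij = 1 if sites i and j are within max_dist.

    Sites are never their own neighbours; coincident sites (distance 0) are
    also excluded by the d > 0 condition.
    """
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all sites.
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=-1))
    w = ((d > 0) & (d <= max_dist)).astype(float)
    z = x - x.mean()  # deviations from the mean
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)


# Toy transect of six equally spaced sites (hypothetical, not study data).
coords = [(i, 0.0) for i in range(6)]
gradient = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]         # smooth trend: similar neighbours
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # neighbours always differ

print(morans_i(gradient, coords, max_dist=1.0))      # positive: clumped
print(morans_i(alternating, coords, max_dist=1.0))   # negative: over-dispersed
```

Values above the null expectation of -1/(n - 1) indicate positive spatial autocorrelation. Judging significance would additionally require a permutation or analytical test, which is omitted here.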

To confirm that our conclusions about the capacity of land cover to predict IBI scores and the optimal spatial scales at which land cover should be considered were not merely the result of positive spatial autocorrelation, we factored out the influence of location on IBI scores and then re-ran the models. This process revealed a significant east–west gradient in IBI scores for both the plant- and bird-based IBIs, most likely the result of the presence of the City of Edmonton in the west of our study region and the national and provincial parks in the east (Fig. 1). Earlier work in the Beaverhills watershed (e.g., Rooney and Bayley 2012; Wilson and Bayley 2012) determined that urban wetlands are typically more disturbed than those in agricultural or protected areas, so this gradient in biotic integrity was not unexpected. Yet, when we re-ran the GLM after factoring out the influence of location, our conclusions were unchanged. Significant models predicting IBI scores using land cover data were generated for all seven spatial extents, and the optimal spatial extent for predicting the plant-based IBI remained smaller than that for predicting the bird-based IBI (Supplementary Table S2).

Regressing IBI scores on land cover after factoring out any influence of location is a conservative approach to confirming our results, as it excludes any correlation with biotic integrity shared by land cover and location, regardless of the underlying mechanism (Fig. 3). Our goal in this study was not to tease apart mechanisms driving the relationship between estimates of biotic integrity and land cover. Rather, we sought to answer two questions fundamental to the development of GIS-based wetland assessments: (1) whether a predictive relationship between land cover and biotic integrity in wetlands of the Beaverhills watershed exists; and (2) at what spatial extent such relationships are best estimated. The fact that our main conclusions were unchanged regardless of the exclusion or inclusion of location’s influence on IBI scores suggests strongly that remotely sensed land cover could serve as the basis for reasonably accurate region-wide assessments of wetland condition.
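The two-step logic of factoring out location can be sketched as follows. This is a minimal illustration under stated assumptions: it uses ordinary least squares on synthetic data in place of the GLM and field data used in the study, and all variable names, sample sizes, and coefficients are hypothetical.

```python
import numpy as np


def ols_fit(X, y):
    """Least-squares fit with an intercept; returns (coefficients, residuals, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, resid, r2


# Synthetic stand-in data (NOT the study's data): 60 hypothetical wetlands.
rng = np.random.default_rng(0)
n = 60
easting = rng.uniform(0, 50, n)    # km, hypothetical coordinates
northing = rng.uniform(0, 50, n)
# Forest cover partly follows an east-west gradient and partly varies locally.
forest = 0.01 * easting + rng.uniform(0, 0.5, n)
ibi = 20 + 0.4 * easting + 60 * forest + rng.normal(0, 3, n)

# Step 1: remove the component of IBI scores explained by location alone.
_, ibi_resid, r2_location = ols_fit(np.column_stack([easting, northing]), ibi)

# Step 2: re-fit the land cover model to the location-free residuals.
beta_lc, _, r2_landcover = ols_fit(forest.reshape(-1, 1), ibi_resid)

print(f"R^2, location only: {r2_location:.2f}")
print(f"R^2, land cover on residuals: {r2_landcover:.2f}")
```

Because any variance shared by location and land cover is assigned to location in step 1, the land cover effect that survives step 2 cannot be attributed to the spatial gradient, which is what makes the approach conservative.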

“Undisturbed” models yield poorer predictions of biotic integrity

In rapidly developing regions or when year-to-year variability in budgets and conservation opportunities impede the immediate implementation of province-wide reserve networks, simple models for identifying conservation priorities out-perform more detailed models (Meir et al. 2004). Thus, single variable measures of wetland condition, like percent undisturbed land within 100 m, are of obvious appeal. This is especially true where management units encompass a patchwork of land cover maps, each created with different data sources, at different resolutions, constructed using different land cover classification approaches. In such cases, reducing all maps to a common denominator (a binary system of disturbed and undisturbed cover types) may overcome many of the challenges presented by integrating such variable components. Previous work has revealed evidence that single-variable models can be predictive of habitat quality. For example, looking at prairie pothole wetlands in North Dakota, Mita et al. (2007) found that the percent cover of grasslands explained 72 % of the variance in a vegetation-based IBI score.

Unfortunately, our results suggest that the amount of undisturbed land cover is a relatively poor surrogate for more detailed models of biotic integrity in depressional wetlands. At all spatial extents, the “undisturbed” model explained substantially less variance in IBI score than the models based on more detailed classifications of land cover data. Allowing for optimization of landscape extent, the best we could achieve using the “undisturbed” model was to explain 63 % of the variance in plant-based IBI scores and 57 % of the variance in bird-based IBI scores. In contrast, our best models using detailed land cover data explained 82 and 70 % of the variation in IBI scores for plants and birds, respectively. If we constrain the spatial extent to the 100 m buffer width (as was recommended by policy makers within our study region), there is an even greater loss of predictive capacity: only 51 % of variance in plant-based IBI scores and 38 % of variance in bird-based IBI scores is explained by the simple models at the 100 m extent. This surprised us, especially for the plant-based IBI, as the backwards stepwise modeling with more detailed land cover data identified the 100 m buffer width as the optimal landscape extent. Thus, it is important to consider the appropriate extent of the landscape, regardless of whether land cover is considered holistically or is reduced to a single representative measure like the proportion of undisturbed habitat.

In terms of the balance between model accuracy and complexity, the AICc values associated with the “undisturbed” models were much larger (i.e., indicating poorer support) than those associated with the models using detailed land cover data, even though the “undisturbed” models had one-fifth the number of predictor variables (Tables 1, 2, 5). Thus, we observe that although single predictor variable models can predict biotic integrity (i.e., the simple models were statistically significant, with p < 0.05), incorporating more detailed land cover data substantially improves both the strength and accuracy of IBI score predictions. Furthermore, measures like the proportion of undisturbed land cover are typically derived from existing GIS datasets that contain more detailed information about land cover. In such instances, the accuracy of wetland assessments could be substantially improved by using the detailed land cover data to estimate IBI scores rather than using them to calculate simple proxies like the proportion of undisturbed land cover.
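To make the accuracy versus complexity trade-off concrete, the sketch below computes AICc for a least-squares model from its residual sum of squares. It assumes Gaussian errors and drops additive constants, so the values are only comparable among models fitted to the same response. The sample size, parameter counts, and residual sums of squares are hypothetical and are not taken from Tables 1, 2, or 5.

```python
import math


def aicc_least_squares(rss, n, k):
    """Small-sample corrected AIC for a least-squares fit.

    rss: residual sum of squares
    n:   number of observations
    k:   number of estimated parameters (coefficients, intercept, error variance)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Hypothetical comparison on the same 45 sites.
simple = aicc_least_squares(rss=490.0, n=45, k=3)    # one predictor
detailed = aicc_least_squares(rss=180.0, n=45, k=7)  # five predictors

# Lower AICc is better: the penalty for extra parameters is outweighed
# here by the much smaller residual sum of squares.
print(f"simple: {simple:.1f}, detailed: {detailed:.1f}")
```

A more complex model earns a lower AICc only when its improvement in fit exceeds the penalty for its additional parameters, which is the pattern reported above for the detailed land cover models.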

The ability to use remotely sensed data in place of intensive assessments that require site visits should inform land use planning and the identification of areas of high conservation or restoration potential (Sundell-Turner and Rodewald 2008). Without understanding the mechanisms by which land cover and biotic integrity of wetlands are connected, GIS-based assessments will not be diagnostic of the cause of biological impairment (King et al. 2005). Rather, in areas where GIS-based assessment suggests impairment, more intensive field-level assessment will be required to confirm and to identify the cause of impairment (Brooks et al. 2004). Thus, we envision a system where GIS-based and field-level assessments are used in concert to facilitate a flexible, adaptive approach to wetland management.

Despite extensive evidence that different taxa interact with their surroundings at different spatial extents, most studies relating the abundance or diversity of biota to land cover explore only a single spatial extent. Especially in the case of depressional wetlands, justification for using a given spatial extent is usually poorly articulated. Due to the important influence of spatial extent on model performance, we recommend that the optimal landscape extent be determined through a calibration process such as the one we undertook, wherein biotic integrity is modeled using land cover data extracted from a variety of spatial extents within a test area before that model is applied to a larger region. Such calibration efforts will help ensure that wetland assessments made using remotely sensed data provide reliable estimates of actual wetland conditions.
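A calibration of this kind amounts to fitting the same model at each candidate extent and comparing fit. The sketch below illustrates the loop with a single predictor, ordinary least squares, and synthetic data in which the response is constructed to depend on land cover at the smallest extent. The list of extents (apart from the 100 m to 3 km range mentioned in the text), the number of sites, and all coefficients are hypothetical.

```python
import numpy as np


def r_squared(x, y):
    """R^2 of a least-squares regression of y on a single predictor x."""
    A = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


rng = np.random.default_rng(42)
n_sites = 45
extents_m = [100, 300, 500, 1000, 1500, 2000, 3000]  # hypothetical buffer radii

# Synthetic land cover: the proportion of forest at 100 m, progressively
# diluted by unrelated landscape as the buffer grows.
base = rng.uniform(0, 1, n_sites)
cover = {}
for extent in extents_m:
    w = 100.0 / extent
    cover[extent] = w * base + (1.0 - w) * rng.uniform(0, 1, n_sites)

# Synthetic IBI scores that respond to land cover at the 100 m extent.
ibi = 30 + 50 * base + rng.normal(0, 2, n_sites)

# Fit the same model at every extent and keep the best-fitting one.
fit = {extent: r_squared(cover[extent], ibi) for extent in extents_m}
best_extent = max(fit, key=fit.get)

for extent in extents_m:
    print(f"{extent:>5} m  R^2 = {fit[extent]:.2f}")
print("best extent:", best_extent, "m")
```

In practice, models with different numbers of predictors at each extent would be better compared with a criterion that penalises complexity, such as AICc, and ideally checked against sites withheld from calibration.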

Our models have been calibrated within the Beaverhills watershed, so we are able to successfully predict biotic integrity using land cover data within this region. The next logical question is how far beyond the calibration region our models will hold before we risk over-stepping their predictive capacity. One important constraint in evaluating this question is that IBIs are also regionally constrained in their application (Karr 1993; Mack 2007). The IBI scores we sought to predict are validated for the Aspen Parkland Ecoregion (the transition zone between the northern prairie and southern boreal habitats), so it stands to reason that our landscape models should be evaluated using the same biotic integrity indices measured across the Aspen Parkland in order to identify the limits to extrapolation.