Introduction

Anthropogenic biomes are directly influenced and shaped by human activities (Ellis and Ramankutty 2008). The vast majority of ice-free land and most of the global tree cover on Earth fall within anthropogenic biomes, yet these areas receive comparatively little attention from ecologists and other scientists (Ellis and Ramankutty 2008). Furthermore, many of Earth’s largest cities occupy watersheds in forest biomes that are also adjacent to invaluable freshwater and saltwater ecosystems (Schneider et al. 2010). These aptly named forest-urban watersheds are frequently characterized by rapid shifts between “green” land cover and developed areas containing a high proportion of impervious surfaces (Inkiläinen et al. 2013). In this case, “green” refers to all natural or man-made land cover that moderates runoff, mediates temperatures, mitigates soil loss, and provides habitat, such as trees, shrubs, grass, wetlands, and retention ponds at all spatial scales (i.e. green infrastructure, as in Tzoulas et al. 2007). Dense populations of people and large areas of impervious surfaces deteriorate water quality, increase flood frequency and/or severity, and generally decrease the quality of life in most forest-urban watersheds (Paul and Meyer 2001; Walsh et al. 2005; Inkiläinen et al. 2013) (Fig. 1).

Fig. 1

Pictures of Miller Creek in the Lincoln Park area of Duluth, MN, USA during normal conditions (top panel) and after a heavy rainfall event (bottom panel). The landscape is representative of a typical forest-urban watershed that contains rapid transitions from green infrastructure (e.g. small patches of trees or grass and individual trees) to impervious surfaces (e.g. flat paved areas and rooftops). Small patches of trees and shrubs are difficult to map with moderate- to low-resolution data, yet may significantly influence water quality and quantity. Google collected the picture in the top panel during 2014 and Todd Carlson from the City of Duluth collected the picture in the bottom panel during a heavy rainfall event in 2007

The potential benefits of green infrastructure in mitigating water pollution and flooding in forest-urban watersheds are well known (e.g. Gill et al. 2007), but there are numerous unresolved issues concerning how the spatial arrangement and quantity of green infrastructure observed at multiple spatial scales (i.e. grain or resolution and extent) are best able to meet quality of life, development, and environmental goals (Felson et al. 2013). Determining the precise quantity and spatial distribution of green infrastructure necessary to achieve maximum ecological services at minimum cost is of paramount interest to scientists, planners, politicians, government agencies, and the local citizenry (e.g. McPherson et al. 2011). This is especially true in light of increasingly strict regulations concerning land management and development issued by government agencies. To achieve the promise of green infrastructure, we need to (1) construct reliable and detailed maps of existing green and developed infrastructure at multiple spatial resolutions and extents, (2) use corresponding empirical models to link mapped infrastructure with different response variables in nearby aquatic ecosystems (e.g. water quality and/or quantity), and (3) construct predictive simulation models for different planning, weather, and climate scenarios.

The focus of this paper is step 1, although substantial progress has been made in achieving all three steps with moderate- to low-resolution data at the watershed scale, where abundant research links increasing proportions of impervious surfaces with poor water quality and more frequent or severe flood events (e.g. Paul and Meyer 2001; Haidary et al. 2013). These studies provide excellent conceptual advancements and adequate recommendations concerning the proportion, and in some cases broad-scale spatial configuration, of developed versus green infrastructure from the perspective of an entire watershed at low spatial resolutions. However, the scope and utility of the findings are limited because sufficient land cover data are not available for depicting the finer scales, rapid spatial changes, and complex three-dimensional structure of green infrastructure that typify forest-urban watersheds at local scales on a block-by-block basis. For example, green infrastructure is not only important at watershed scales, but is also germane at neighborhood and site-specific extents, where smaller patch sizes of 100–200 m² predominate and are not precisely mapped by even moderate-resolution data. The cumulative effects of green infrastructure at local extents may be significant when summarized at the watershed scale, but without high-resolution land cover products we cannot even consider fine-scale features across entire watersheds because these features are not observable.

Relatively recent (i.e. last ~10–15 years) and ongoing advances in remote sensing have allowed computerized production of detailed land cover and land use maps at spatial resolutions of 1–2 m or less that accurately represent fine-scale patchiness in heterogeneous forest-urban landscapes (e.g. Hodgson et al. 2003; Zhou and Troy 2008). These land cover data are not widely available, and studies rarely link them with water quality and quantity. Moreover, no study that we are aware of has accurately mapped the three-dimensional structure of green infrastructure at high resolutions across an entire watershed. This lack of fine-scale observation is significant. For example, the canopies of urban forests and trees extend out over impervious surfaces and are capable of intercepting up to two-thirds of precipitation (Asadian and Weiler 2009), thus mitigating the detrimental effects of the underlying impervious surface on water quality and quantity (King and Locke 2013). Mapping the percent canopy overlapping impervious surfaces could be a useful tool in developing recommendations of where green infrastructure should be added or possibly removed, especially when such information is combined with other land cover data, ecological expertise, and watershed modeling. Here, we assemble a stack of high-resolution geospatial data from multiple sources for a complex forest-urban watershed in Duluth, MN, USA and evaluate the use of object-based image analysis in (1) generating a 9-class land cover map and (2) mapping the percent tree canopy overlapping impervious surfaces. We also conduct some basic area comparisons with three key analog classes from the 2011 NLCD.

Methods

Study area

Miller Creek is a cold-water trout stream that occupies a forest-urban watershed containing a patchy landscape of green infrastructure and development. The watershed covers about 2500 ha of land at the southern border of the boreal forest biome in Duluth, MN, USA (pop. of 86,211 in 2012), draining into the Saint Louis River and eventually into Lake Superior. Wetlands, ponds, streams, shrubs, deciduous trees, coniferous trees, bare ground, and development exist in a complex mosaic across the spatially heterogeneous landscape of the watershed. Rising water temperatures, sediment and turbidity levels, chloride concentrations, and mercury levels in fish are all problematic in Miller Creek, which has been designated as “impaired water” by the Minnesota Pollution Control Agency (Minnesota Pollution Control Agency 2012).

Geospatial data acquisition

We assembled a diverse geospatial dataset for mapping the spatially complex nature of land cover in the Miller Creek watershed, including data from a suite of active and passive aerial remote sensing systems and existing online archives (Table 1). Light Detection and Ranging (LiDAR) is a form of active remote sensing where laser pulses generated from a sensor are used to detect information about ground elevation, buildings, and vegetation types (Asner et al. 2011). It is particularly useful for differentiating between buildings, trees, and bare earth surfaces. Four-band aerial photography is widely available for most urban areas during leaf-off and leaf-on conditions. This is a form of passive remote sensing where the sensor records naturally reflected wavelengths from the visible (red, green, and blue) and near-infrared portions of the electromagnetic spectrum. It is particularly useful for differentiating between coniferous and deciduous trees and between green and developed areas. USGS 10-m digital elevation models (DEMs) are available in existing online archives. This type of DEM is derived from many different types of “best-available” source data. These 10-m DEMs offer the advantage of fewer surface anomalies from bridges, roads, and other artificial “dams” that plague high-resolution DEMs. Soil Survey Geographic database (SSURGO) data are derived from field surveys and interpretations of aerial photographs, offering the most detailed level of soils information.

Table 1 Metadata and processing information for geospatial data collected and used in the object-based image analysis approach

Pre-processing and extracting derivatives from geospatial data

Using 38 ground control points evenly distributed around the perimeter and interior of the watershed, we registered all raster-based data to the 2011 leaf-off orthophoto. We used second-order polynomial transformations and bilinear resampling for registration because of the relatively complex terrain in the watershed, the continuous nature of the input rasters, and because we did not intend to extract biophysical data directly from the resampled values. We achieved RMSEs of 1–3 pixels for the high-resolution data and less than 0.5 pixels for the low-resolution data.
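To make the registration step concrete, the following is a minimal sketch of fitting a second-order polynomial transformation to ground control points and reporting the RMSE. It is not the software workflow used in this study; the function names and the synthetic control points are hypothetical and serve only to illustrate the calculation.

```python
# Illustrative sketch only: least-squares fit of a second-order polynomial
# warp to ground control points (GCPs), followed by the registration RMSE.
# All names and coordinates here are hypothetical.

def design_row(x, y):
    # The six terms of a second-order 2-D polynomial.
    return [1.0, x, y, x * x, x * y, y * y]

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small square system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_poly2(src, target):
    # Normal equations (A^T A) c = A^T t for one output coordinate.
    rows = [design_row(x, y) for x, y in src]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    att = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(6)]
    return solve(ata, att)

def apply_poly2(coef, x, y):
    return sum(c * t for c, t in zip(coef, design_row(x, y)))

def registration_rmse(src, dst):
    # Fit separate polynomials for the x and y outputs, then measure the
    # root-mean-square distance between predicted and reference GCPs.
    cx = fit_poly2(src, [p[0] for p in dst])
    cy = fit_poly2(src, [p[1] for p in dst])
    sq = 0.0
    for (x, y), (u, v) in zip(src, dst):
        sq += (apply_poly2(cx, x, y) - u) ** 2 + (apply_poly2(cy, x, y) - v) ** 2
    return (sq / len(src)) ** 0.5

# Synthetic GCPs on a 3 x 3 grid (at least six well-spread points are needed).
gcp_src = [(float(x), float(y)) for x in range(3) for y in range(3)]
gcp_dst = [(2 + x + 0.1 * x * y, -1 + y + 0.05 * x * x) for x, y in gcp_src]
print(registration_rmse(gcp_src, gcp_dst))
```

Because the synthetic targets are themselves an exact second-order function of the source coordinates, the residual here is essentially zero; real control points leave residuals that depend on terrain and on how accurately each point was placed.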

Valuable spatial information is often lost when relying only on the original bands of LiDAR data, aerial photography, and 10-m DEMs. Calculating derivatives from geospatial data can significantly enhance land cover classification quality, especially when using object-based approaches (Guan et al. 2013). Based on recommendations from the literature and our own empirical observations, we calculated a multitude of derivatives from the geospatial data. Pre-processing and derivative calculations provided 23 layers of geospatial data for inclusion in the object-based image analysis (Table 1).

Developing maps for land cover and percent canopy over impervious surfaces

The Feature Analyst 5.0 extension in ArcGIS 10.1 was used to apply an object-based image analysis (OBIA) technique (Opitz and Blundell 2008) for sequentially extracting nine land cover classes from the 23 layers of assembled data. OBIA techniques provide a superior approach for classifying high-resolution data encompassing the complex landscapes of forest-urban environments because shape and texture are considered in addition to values from individual pixels (Chen et al. 2009). Classes included in the approach were water, conifer tree, deciduous tree, building, grass, impervious, wetland, shrub, and bare ground. We applied supervised techniques predicated on training data capturing a range of spatial, textural, and spectral variability for each class, using iterative refinements to improve some of the most challenging classes (e.g. up to 8 iterations for wetlands and shrubs). As few as 3–4 training polygons may suffice for classification of simple classes (Opitz and Blundell 2008), but many of our classes displayed a complex range of characteristics (e.g. size, shape, and texture) and we therefore used 30 to 50 training polygons for each class. We used squaring algorithms for anthropogenic objects and a series of smoothing and aggregation algorithms for post-classification processing, all of which are embedded in the Feature Analyst 5.0 software. After classification, we applied a majority filter to assign values to small remaining areas of unclassified pixels, which resulted in less than 0.5 % of the watershed being unclassified.
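The final majority-filter step can be illustrated with a short sketch. This is not the filter implemented in the GIS software used here; the window size, the `UNCLASSIFIED` code, and the function name are assumptions chosen for the example.

```python
from collections import Counter

UNCLASSIFIED = 0  # assumed placeholder code for unclassified pixels

def fill_unclassified(grid, radius=1):
    """Give each unclassified cell the most common class among the classified
    cells in its (2 * radius + 1) square neighborhood."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != UNCLASSIFIED:
                continue  # classified cells are left untouched
            votes = Counter()
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    if grid[rr][cc] != UNCLASSIFIED:
                        votes[grid[rr][cc]] += 1
            if votes:  # cells with no classified neighbors stay unclassified
                out[r][c] = votes.most_common(1)[0][0]
    return out

# Hypothetical 3 x 3 patch with one unclassified cell in the center.
patch = [[1, 1, 2],
         [1, 0, 2],
         [3, 1, 2]]
print(fill_unclassified(patch))
```

Cells with no classified neighbors inside the window keep the unclassified code, which is one way a small unclassified remainder can persist after filtering.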

Each land cover class was generated individually from a vertical aerial perspective, which produced cumulative class estimates of area from all land cover classes that exceeded total area of the watershed because of spatial overlaps between the classes. Most of the overlap was a result of tree branches (i.e. canopy) overhanging impervious areas, which presented an opportunity to map the area of canopy overlapping impervious surfaces by clipping tree classifications to an impervious layer (i.e. flat impervious surfaces and buildings). Visual inspection of our geospatial data indicated that LiDAR, leaf-off/on aerial imagery, and corresponding derivatives highlighted many of the areas containing canopy overlap with impervious surfaces. We created new training sets for these areas and applied object-based image analysis with Feature Analyst 5.0 to create an impervious class occupying the space under overhead tree canopy. This new class was merged with the previously mapped building and impervious classes. The merge was imperfect and included some gaps between the two classes. We filled the gaps and squared the edges before clipping out any trees (i.e. from the high-resolution deciduous and conifer classes) overlapping the impervious layer. The clipped out trees represent the map of canopy overlapping impervious surfaces.
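The clipping logic described above amounts to a cell-by-cell intersection of a tree layer with an impervious layer. The sketch below assumes both layers are available as co-registered binary rasters with 0.5-m cells; the function name and the tiny example grids are hypothetical and do not reproduce the vector-based workflow used in the study.

```python
def canopy_overlap(tree, impervious, cell_size_m=0.5):
    """Intersect a tree mask with an impervious mask.

    Returns the overlap mask, the overlap area in hectares, and the percent
    of impervious cells that lie under tree canopy.
    """
    cell_area_ha = (cell_size_m ** 2) / 10_000.0  # 1 ha = 10,000 square meters
    overlap = []
    overlap_cells = 0
    impervious_cells = 0
    for t_row, i_row in zip(tree, impervious):
        o_row = [bool(t) and bool(i) for t, i in zip(t_row, i_row)]
        overlap.append(o_row)
        overlap_cells += sum(o_row)
        impervious_cells += sum(1 for i in i_row if i)
    percent = 100.0 * overlap_cells / impervious_cells if impervious_cells else 0.0
    return overlap, overlap_cells * cell_area_ha, percent

# Hypothetical 2 x 3 rasters: 1 marks presence of the class in a cell.
tree_mask = [[1, 1, 0],
             [0, 1, 0]]
impervious_mask = [[0, 1, 1],
                   [0, 1, 1]]
mask, area_ha, pct = canopy_overlap(tree_mask, impervious_mask)
print(area_ha, pct)
```

Summarizing the same mask by block or neighborhood rather than by watershed would give the local percent overlap that the map is intended to reveal.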

Accuracy assessments and class comparisons

Accuracy of the high-resolution land cover classification was assessed using contemporary insight and methods outlined by Olofsson et al. (2013) and Mas et al. (2014) as opposed to assessment techniques traditionally used in remote sensing (e.g. Congalton and Plourde 2002). A stratified random sampling scheme of 33 points per class was implemented in a geographic information system. We discarded points randomly placed on the edge (i.e. within 2 m or less) of two or more land cover types. The remaining 269 points were assessed for accuracy by comparisons with available aerial imagery (Google Maps, NAIP imagery, etc.) in a GIS and/or visual confirmation from field visits when necessary. We created a validation raster based on the assessed points and added it to accuracy flow models (i.e. “calculate_matrices” and “calculate_accuracy_indices_withCI” models) containing the high-resolution land cover raster in DINAMICA EGO software. This allowed us to generate unbiased estimates of accuracy and area at the 95 % confidence level (i.e. similar to methods outlined by Mas et al. 2014). A similar technique was applied to the map of tree canopy overlapping impervious surfaces, but 100 points were randomly distributed within each of the two classes (i.e. canopy overlap and non-canopy overlap of impervious surfaces). The number of sample points was increased because of the extreme variability in percent canopy overlapping impervious surfaces across the watershed. We restricted the non-canopy overlap class to within 20 m of existing impervious surfaces, which corresponds to the maximum distance of potential tree overlap associated with patches of very large trees. It was necessary to discard 19 points that straddled class borders, which left 181 points for the accuracy assessment. We also downloaded 2011 NLCD data for the watershed and assessed them with similar techniques for comparative purposes. Small NLCD classes were merged together. We sampled 30 points out of the 10 remaining classes in the watershed and again removed points straddling class edges, which left 322 points for the assessment. Our main interest in the NLCD is the unbiased area estimates of three key analog classes that match up with the high-resolution land cover data (i.e. water, conifer, and deciduous). Hence, only overall accuracy is presented in addition to adjusted area estimates for the three classes.
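For readers unfamiliar with the stratified estimators described by Olofsson et al. (2013), the sketch below shows how an error matrix of sample counts can be weighted by mapped class proportions to obtain overall accuracy, user's and producer's accuracies, and adjusted areas with 95 % confidence intervals. It is a simplified illustration rather than the DINAMICA EGO models used here, and the two-class counts and areas in the example are invented.

```python
def stratified_accuracy(counts, mapped_area):
    """Area-weighted accuracy and area estimates from a stratified sample.

    counts[i][j]  : sample points mapped as class i whose reference class is j
    mapped_area[i]: mapped area of class i (map classes are the strata)
    Assumes every class has at least two sample points and appears at least
    once in the reference data.
    """
    k = len(counts)
    total_area = sum(mapped_area)
    w = [a / total_area for a in mapped_area]   # mapped area proportions
    n = [sum(row) for row in counts]            # sample size per map class
    # Estimated proportion of total area in each cell of the error matrix.
    p = [[w[i] * counts[i][j] / n[i] for j in range(k)] for i in range(k)]
    overall = sum(p[i][i] for i in range(k))
    users = [p[i][i] / sum(p[i]) for i in range(k)]
    col = [sum(p[i][j] for i in range(k)) for j in range(k)]
    producers = [p[j][j] / col[j] for j in range(k)]
    adjusted_area = [total_area * c for c in col]
    ci95 = []
    for j in range(k):
        var = sum(
            w[i] ** 2 * (counts[i][j] / n[i]) * (1 - counts[i][j] / n[i]) / (n[i] - 1)
            for i in range(k)
        )
        ci95.append(1.96 * total_area * var ** 0.5)
    return overall, users, producers, adjusted_area, ci95

# Invented two-class example: 10 sample points per class, areas in hectares.
example_counts = [[9, 1],
                  [2, 8]]
example_area = [40.0, 60.0]
print(stratified_accuracy(example_counts, example_area))
```

The adjusted area of a class can differ noticeably from its mapped area when omission and commission errors are unbalanced, which is why area ranges rather than single mapped values are reported in the Results.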

Results

Accuracies

Unbiased estimates of overall accuracy for both high-resolution mapping techniques hovered around 90 % (Tables 2 and 3). Overall accuracy was 90.40 ± 4.58 % for the high-resolution land cover map and 92.69 ± 5.10 % for the map of percent tree canopy overlapping impervious surfaces. Individual unbiased accuracies (Table 2) for many classes in the high-resolution land cover map (Fig. 2) were near 90 % or more. Water, conifer, deciduous, building, and impervious classes exhibited the highest accuracies. Grass, wetland, shrub, and bare ground classes were less accurate. Individual unbiased accuracies (Table 3) of the two classes present in the percent canopy overlapping impervious surfaces map (Fig. 3) were 80–90 % or more, except for a low producer accuracy for canopy overlap.

Table 2 Unbiased accuracy and area statistics for the high-resolution land cover classification in the Miller Creek watershed
Table 3 Unbiased accuracy and area statistics for the percent tree canopy overlapping impervious surface classification in the Miller Creek watershed
Fig. 2

High-resolution land cover maps of Miller Creek based on object-based image analysis

Fig. 3

High-resolution map of tree canopy overlapping impervious surfaces (i.e. buildings and flat impervious surfaces) in Miller Creek based on a stack of multiple geospatial data sources and object-based image analysis. Leaf-off imagery from 2011 is in the background. Notice the high proportion of overlap throughout the old neighborhood depicted in the lower left subset

Unbiased area estimates and watershed metrics

Of the 2,580.30 ha mapped at a high resolution in the watershed, unbiased area estimates produced generally narrow ranges for each class (Table 2). Water comprised 8.90–10.10 ha (0.34–0.39 %) of the watershed. Trees comprised 1,104.86–1,334.91 ha (42.82–51.73 %) of the watershed. Buildings and other impervious flat surfaces comprised 485.48–583.06 ha (18.81–22.60 %) of the watershed. Tree canopy overlapped 9.71–55.97 ha in the watershed (0.38–2.17 %) or 5.48–14.6 % of impervious surfaces (Table 3). Compared to the high-resolution land cover classification, the 2011 NLCD overestimated the area of water in the watershed and underestimated the area of conifer and deciduous trees (Tables 2 and 4).
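The percentages above follow directly from dividing each area bound by the 2,580.30 ha mapped. A brief check of that arithmetic, using only the values reported in this paragraph (the helper name is ours), is:

```python
TOTAL_HA = 2580.30  # total area mapped at high resolution

def percent_of_watershed(area_ha):
    # Express an area in hectares as a percent of the mapped watershed.
    return round(100.0 * area_ha / TOTAL_HA, 2)

reported = {
    "water": (8.90, 10.10),
    "trees": (1104.86, 1334.91),
    "buildings and impervious": (485.48, 583.06),
    "canopy over impervious": (9.71, 55.97),
}
for name, (low, high) in reported.items():
    print(name, percent_of_watershed(low), percent_of_watershed(high))
```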

Table 4 Unbiased overall accuracy and area statistics for three key 2011 NLCD classes

Discussion/Conclusions

Both the high-resolution land cover and canopy overlapping impervious surface maps displayed relatively high overall accuracy consistent with other high-resolution mapping endeavors (e.g. Mathieu et al. 2007; Zhou et al. 2008; Myint et al. 2011), some widely used moderate-resolution products such as NLCD and Vegetation Change Tracker (e.g. Stueve et al. 2011; Wickham et al. 2013), and established thresholds for acceptable accuracy in land cover products (Shao and Wu 2008). The comparatively narrow adjusted area ranges of several individual high-resolution land cover classes indicate a higher degree of confidence in the mapped areas of those classes. Not surprisingly, land cover classes containing vertical structure were more reliably classified than other land cover classes, except for water. Indeed, it is challenging to discern the spatial and spectral differences between shrub, grass, wetland, and bare ground because the three-dimensional capabilities of LiDAR and seasonal aerial snapshots offer fewer advantages for these classes. Platforms with increased spectral and/or temporal resolutions may improve these underperforming classes. The range of the adjusted area estimate for tree canopy overlapping impervious surfaces is wider than the ranges for many of the most accurate classes from the high-resolution land cover map, but much narrower than the ranges for the three key NLCD classes. This indicates that the canopy overlap output is useful in small spatially heterogeneous watersheds and a worthwhile endeavor, but that improved accuracy and confidence would be beneficial. Increasing pulse density from LiDAR sensors may provide an opportunity to improve this classification because the LiDAR used here failed to penetrate a few areas of dense tree canopy and slightly underestimated the extent of individual tree canopies with complex edges. These two issues probably explain why the producer accuracy for canopy overlap was so low.

Scale selection errors are inherent in all remotely sensed land cover products and are highly dependent on the spatial resolution of data used in the analysis, spatial extent of the study, and the functional scales of processes being investigated (Shao and Wu 2008). For example, a moderate-resolution land cover map such as the NLCD may have respectable overall accuracy for comparatively large county- and state-wide analyses, yet simultaneously fail to capture important hydrological, ecological, social, and other features on select landscapes and watersheds of interest embedded at local scales. Features existing at more local scales may nevertheless be important in aggregate across the entire study area. Indeed, the patchy and highly variable nature of green infrastructure in forest-urban watersheds fits the aforementioned criteria and is likely difficult to detect with a high degree of confidence when using moderate- to low-resolution data. Comparisons between three key analog NLCD classes and the high-resolution land cover map quantitatively capture some of these pitfalls. For example, water is one of the most easily mapped classes in remote sensing and one would expect significant overlap between the adjusted area ranges of water for both the NLCD and high-resolution land cover map. However, the NLCD data greatly overestimate the area of water in the watershed (i.e. the lowest NLCD estimate is about double the maximum area estimate from the high-resolution map). A highly probable explanation of the disparity is that surface water comprises a small proportion of Miller Creek watershed and is interspersed as small patches with wetlands and shrubs, which makes it difficult to map at lower spatial resolutions. A similar phenomenon probably occurs with key green infrastructure, such as trees, but the trend with trees is difficult to directly extract from the NLCD data because various developed NLCD classes include some trees and there is a small “mixed” class of trees. These issues probably explain why the 2011 NLCD significantly underestimates the adjusted area of trees in the Miller Creek watershed (506.57–691.28 ha versus 1,104.86–1,334.91 ha). For example, the top left subset panel of Fig. 4 contains only three broad “developed” classes, but the high-resolution land cover map to the right reveals a much more complex landscape consisting of conifer trees, deciduous trees, roads, buildings, grass, and shrubs in a variety of patch sizes and shapes.

Fig. 4

Comparison of 30-m NLCD and 0.5-m high-resolution maps of the Miller Creek watershed from 2011. Note how the 30-m subset appears to be a simple landscape with broad shapes and a mere three classes whereas the high-resolution 0.5-m subset portrays a relatively complex landscape with much more variability

Overall, our analysis demonstrates that object-based image analysis techniques used in conjunction with high-resolution geospatial datasets in forest-urban watersheds can produce reliable mapping products suitable for scientific analysis and inclusion in the decision-making processes of managers. These high-resolution products could fill an important information gap in NLCD data and other comparable moderate- to low-resolution products that struggle to quantify fine-scale changes across large areas. More specifically, the quantity and spatial distribution of green infrastructure detected by high-resolution products could exert significant influences on water quality and quantity that are difficult, if not impossible, to detect with moderate- to low-resolution land cover products. Achieving an increased level of detail with high-resolution maps allows careful empirical evaluations and modeled simulations of the relationships between green infrastructure and both water quality and quantity. Of paramount interest is the exploration of potential shifts in the strength and nature of these relationships at an array of ecologically meaningful scales ranging from local neighborhoods to entire watersheds. Furthermore, the success of our approach and recent successes in the automated classification of geospatial data (e.g. Huang et al. 2010) suggest that automation and efficient applications of OBIA-based high-resolution mapping are plausible.

As expounded upon by Nixon (2009) in describing one of H.T. Odum’s intellectual contributions from the systems approach, the study of nature requires interplay between observations using “microscopes” and “macroscopes” and the distinct realms they each focus on. Our successful development of high-resolution products in Miller Creek provides evidence that finer scale limits can be pushed with remote sensing technologies and contemporary analytical approaches. Automation and cost-effective applications of high-resolution mapping across multiple watersheds and broader scales could regularly provide a more comprehensive “microscopic” view of landscapes in the ecologist’s toolkit to complement the more traditional macroscopic use of remote sensing. Moreover, theoretical ecologists as well as physicists and other scientists point out that there is no single “correct” scale of observation for nature, yet they also offer that macroscopic behaviors often provide predictability from among more unpredictable lower levels of ecological hierarchies (O’Neill et al. 1986; Levin 1992). In advocating for the wide array of research applications made possible by contemporary high-resolution mapping technologies, we are setting the stage for comprehensive evaluations of the impacts that relatively small features distinguishable at fine scales exert on watersheds, an exciting opportunity for ecologists and managers to consider.