Introduction

Stream ecologists have long struggled with defining what they feel is the correct scale for their research. It has been effectively argued that streams should be treated like aquatic landscapes (Wiens, 2002), and like all of landscape ecology, the appropriate scale of sampling and analysis depends on the research question. But the exact link between question and scale has never been very clearly elaborated. Variation in the processes and patterns of the stream environment has an obvious spatial and temporal hierarchy (Frissell et al., 1986), with ramifications for component communities such as the macroinvertebrates (Boyero & Bailey, 2001; Parsons et al., 2003; Townsend et al., 2004). Minshall and Peterson (1985) proposed that at small spatial (mm to cm) and temporal (ms to h) scales, the interrelated hydrodynamics and substrate of a stream determine the distribution of individuals, and by extension, the structure of communities. At larger (km) scales of stream reach and basin, factors such as discharge, channel size, riparian vegetation, and both surficial and bedrock geology often are more important in determining community structure (e.g., Yates & Bailey, 2006). Longitudinal, cumulative effects of these larger scale factors give rise to the longitudinal zonation of rivers (Vannote et al., 1980). It is also at this larger spatial scale of the reach or the entire stream that bioassessment usually asks the question of whether or not human activity has affected the stream.

Assessing the effects of human activity on ecosystems at more than one scale may yield complex or even contradictory results. For example, bank erosion at a given point in a stream may be caused by agricultural activity at a drainage basin scale, channelization at the reach scale, cattle access at a very small scale, or a combination of all three (Imhof et al., 1996). The ecological effects of the erosion may only be measurable at a very small scale or may alter the structure of the entire downstream ecosystem. The complexity of scales in the environment and effects on the biological communities make understanding the stream ecosystem difficult, and environmental assessment and remediation very challenging (Hawkins et al., 1993). Sometimes even the most basic question, “Is sampling one reach of a stream adequate to characterize the stream for environmental assessment?”, is impossible to answer.

In this study, we quantified the hierarchical variation in the structure of macroinvertebrate communities (including their pollution tolerance) and their substrate environments in 10 major tributary streams of the 6th order Thames River in temperate, northeastern North America. We then correlated the structure of the macroinvertebrate community with its multi-scale environment, and determined from this an efficient strategy for sampling streams in bioassessments.

Methods

The Upper Thames River Catchment Area (UTRCA) encompasses 3,500 km2 in southwestern Ontario, Canada. The climate is humid continental, with most annual precipitation in either April or November. The gently rolling area has mostly sandy, loamy soils, and much of the extensive corn, soybean, and winter wheat croplands have subsurface tile drainage, with attendant water quality problems in the receiving streams (Barton, 1996). Ten primarily agricultural, 4th order streams in UTRCA were selected for this study of the hierarchical structure of stream ecosystems (Table 1). Two geomorphologically distinct reaches within each stream (UP and DN), at least 1 km and at most 3 km apart on the stream channel, were chosen based on accessibility to private lands. Three longitudinally consecutive riffles (UP, MD, DN) were sampled within each reach. Three parallel points within each riffle (LT, MD, RT) were sampled as described below.

Table 1 Major tributary streams of the Upper Thames River that were sampled in this study

Each of the 20 reaches (10 streams × 2 reaches/stream) was visited in random order between 29 June and 20 July 1998. At a reach, sampling was carried out beginning with the downstream riffle and working upstream to minimize longitudinal disturbance. A 5 m (downstream to upstream) kick sample was taken at the three parallel points within each riffle with a 500 μm D-net. The washed debris from each kick sample was preserved in 70% ethanol and then physical and chemical habitat measurements were made. Substrate was visually assessed for each kick sample point as per cent bedrock, boulder, cobble, pebble, gravel, sand, silt, and clay. None of the points had bedrock substrate, so analyses were limited to boulder-sized and smaller particles. Following macroinvertebrate sampling and substrate assessment, each reach was scored with a modification of the EPA Habitat Assessment for high gradient streams that considered available cover, embeddedness of the substrate, velocity/depth regime, channel alteration, scouring and deposition, frequency of riffles, bank stability, bank vegetation, and riparian vegetation (Plafkin et al., 1989).

Macroinvertebrate samples with their associated debris were subsampled using a 100 cell Marchant box (Marchant, 1989), such that a minimum of 200 individuals were used to calculate the diversity (using S, taxonomic richness), tolerance (using BI, Hilsenhoff’s (1987) biotic index), and proportional composition of the community at a given spot in the riffle. The invertebrates were identified to genus using Merritt and Cummins (1995), Thorp and Covich (1991), Wiederholm (1983), and Wiggins (1995). All Chironomidae were mounted on glass slides using CMC-9AF mounting medium from Master’s Chemical Company, Inc. Worms (Oligochaeta, Turbellaria), mites (Arachnida), and leeches (Hirudinea) were identified to Class only.
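As a rough illustration of the two summary indices, the sketch below computes taxonomic richness (S) and a Hilsenhoff-style biotic index as the abundance-weighted mean tolerance value of the taxa in a subsample. The taxon labels, counts, and tolerance values are invented for illustration and are not data from this study. Skipping taxa that lack a tolerance value is an assumption of the sketch, not a rule stated in the text.

```python
def richness(counts):
    """Number of taxa with at least one individual in the subsample."""
    return sum(1 for n in counts.values() if n > 0)


def biotic_index(counts, tolerance):
    """Abundance-weighted mean tolerance value (0 = intolerant, 10 = tolerant).

    Taxa without a tolerance value are skipped (an assumption of this sketch).
    """
    scored = {t: n for t, n in counts.items() if t in tolerance and n > 0}
    total = sum(scored.values())
    if total == 0:
        raise ValueError("no individuals with a tolerance value")
    return sum(n * tolerance[t] for t, n in scored.items()) / total


# Hypothetical subsample of 200 individuals (not data from this study).
counts = {"Taxon A": 80, "Taxon B": 60, "Taxon C": 40, "Taxon D": 20}
# Hypothetical tolerance values on the 0-10 scale.
tolerance = {"Taxon A": 4, "Taxon B": 6, "Taxon C": 2, "Taxon D": 8}

S = richness(counts)
BI = biotic_index(counts, tolerance)
```

Because BI is a weighted mean, a few abundant tolerant taxa can shift it more than many rare intolerant ones, which is one reason richness and tolerance are reported separately.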

We summarized the variation and covariation of substrate properties among the 180 observations (10 streams × 2 reaches/stream × 3 riffles/reach × 3 kick points/riffle) with Principal Component Analysis (PCA) of the covariance matrix of the proportion of the substrate in different size categories. Examination of a scree plot indicated that the first two gradients were interpretable, so Principal Component (PC) scores for the first two axes were calculated for each observation. Variation and covariation of EPA habitat assessment scores from 20 observations (10 streams × 2 reaches/stream) were also summarized with PCA of the covariance matrix of the nine habitat descriptors. A scree plot showed three interpretable gradients in habitat assessment descriptors, so PC scores for the first three axes were calculated for each of the 20 observations.
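A PCA of a covariance matrix, as described above, can be sketched with an eigendecomposition. The example below is a minimal illustration using NumPy and simulated proportions; the number of size classes, the random data, and the function name `pca_covariance` are assumptions for illustration and do not reproduce the analysis software or data used in the study.

```python
import numpy as np


def pca_covariance(X, n_components):
    """PCA via eigendecomposition of the covariance matrix of the columns of X.

    Returns (scores, loadings, explained), where explained is the proportion
    of total variance captured by each retained axis.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components]
    scores = centered @ loadings
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, loadings, explained


# Simulated substrate proportions for 180 points (synthetic, rows sum to 1).
n_classes = 7  # hypothetical number of particle size classes
rng = np.random.default_rng(0)
substrate = rng.dirichlet(np.ones(n_classes), size=180)

scores, loadings, explained = pca_covariance(substrate, n_components=2)
```

Working from the covariance rather than the correlation matrix means that size classes with larger variance in per cent cover carry more weight in the resulting axes.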

Nested Analysis of Variance (ANOVA) was used to partition variability in S and BI of the macroinvertebrate communities among streams, reaches within streams, riffles within reaches, and points within a riffle. The nested variation in community composition was characterized using a modification of Underwood and Chapman’s (1998) technique. Total variation in composition among the 180 communities was described with the Bray-Curtis distance (minimum = 0, maximum = 1) between each community and the overall mean proportion of each taxon across the 180 communities. For each scale in the hierarchy (streams, reaches within streams, riffles within reaches, and points within riffles) we then calculated the median of Bray-Curtis distances between units at a given scale and the mean community for that scale.
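One possible reading of the distance-to-mean calculation is sketched below: each unit is compared with the mean community of the group that contains it, and the median of those Bray-Curtis distances summarizes variation at that scale. The point labels and proportions are hypothetical, the helper names are our own, and the grouping rule is an interpretation of the description above rather than the exact procedure of Underwood and Chapman (1998).

```python
from statistics import median


def bray_curtis(p, q):
    """Bray-Curtis distance between two abundance or proportion vectors."""
    num = sum(abs(a - b) for a, b in zip(p, q))
    den = sum(a + b for a, b in zip(p, q))
    return num / den if den else 0.0


def mean_community(communities):
    """Taxon-by-taxon mean of a list of equal-length vectors."""
    n = len(communities)
    return [sum(col) / n for col in zip(*communities)]


def median_distance_to_parent_mean(units, parent_of):
    """Median Bray-Curtis distance from each unit to the mean of its parent group.

    units: {unit_id: proportion vector}; parent_of: {unit_id: parent_id}.
    """
    members = {}
    for uid, parent in parent_of.items():
        members.setdefault(parent, []).append(uid)
    distances = []
    for uids in members.values():
        centre = mean_community([units[u] for u in uids])
        distances.extend(bray_curtis(units[u], centre) for u in uids)
    return median(distances)


# Hypothetical proportions of three taxa at three points in each of two riffles.
points = {
    "R1-LT": [0.5, 0.3, 0.2], "R1-MD": [0.4, 0.4, 0.2], "R1-RT": [0.6, 0.2, 0.2],
    "R2-LT": [0.1, 0.1, 0.8], "R2-MD": [0.2, 0.1, 0.7], "R2-RT": [0.0, 0.1, 0.9],
}
riffle_of = {pid: pid.split("-")[0] for pid in points}

within_riffle = median_distance_to_parent_mean(points, riffle_of)
```

The same function could be applied at coarser scales by passing riffle means grouped by reach, or reach means grouped by stream.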

Nested ANOVA was also used to partition variation in substrate (Substrate PC1, PC2) into stream, reach, riffle, and sampling point components, and habitat assessment scores (Habitat PC1, PC2, PC3) between stream and reach components. The covariation of the macroinvertebrate community and its environment was partitioned among the spatial scales using a nested ANOVA. Correlations between richness (S) and pollution tolerance (BI) of the biota, and the substrate (Substrate PC1, PC2) at the stream, reach, riffle, and point scales were calculated using the sum of squares and cross products matrix for each scale. Similarly, correlations between richness and pollution tolerance of the biota, and the habitat assessment scores (Habitat PC1, PC2, PC3) were calculated for the stream and reach scales, since only one habitat assessment was done at each reach.
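For a balanced nested design, the sum of squares and cross products (SSCP) matrix can be split into one matrix per scale, and a correlation can be read off each. The sketch below assumes a complete 10 × 2 × 3 × 3 layout and uses simulated values for two variables; the function names and the data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def nested_sscp(Y):
    """Partition the total SSCP matrix of a balanced nested design.

    Y has shape (streams, reaches, riffles, points, variables).
    Returns {scale: SSCP matrix}; the four matrices sum to the total SSCP.
    """
    Y = np.asarray(Y, dtype=float)
    grand = Y.mean(axis=(0, 1, 2, 3), keepdims=True)
    stream = Y.mean(axis=(1, 2, 3), keepdims=True)
    reach = Y.mean(axis=(2, 3), keepdims=True)
    riffle = Y.mean(axis=3, keepdims=True)
    deviations = {
        "stream": stream - grand,
        "reach": reach - stream,
        "riffle": riffle - reach,
        "point": Y - riffle,
    }
    p = Y.shape[-1]
    out = {}
    for scale, dev in deviations.items():
        # Broadcast each deviation to every observation, then flatten to rows.
        rows = np.broadcast_to(dev, Y.shape).reshape(-1, p)
        out[scale] = rows.T @ rows
    return out


def scale_correlation(sscp, i=0, j=1):
    """Correlation between variables i and j implied by one SSCP matrix."""
    return sscp[i, j] / np.sqrt(sscp[i, i] * sscp[j, j])


# Simulated data: a shared stream-level effect plus independent noise.
rng = np.random.default_rng(1)
shape = (10, 2, 3, 3)
stream_effect = rng.normal(size=(10, 1, 1, 1))
richness = 30 + 5 * stream_effect + rng.normal(size=shape)
substrate_pc1 = 2 * stream_effect + rng.normal(size=shape)
Y = np.stack([richness, substrate_pc1], axis=-1)

sscp = nested_sscp(Y)
correlations = {scale: scale_correlation(m) for scale, m in sscp.items()}
```

The diagonal of each matrix holds the sums of squares for each variable at that scale, so the same decomposition underlies both the univariate partition of variation and the scale-specific correlations.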

Results

There was a total of 135 taxa found at the 180 sampling sites, with a median of 31 taxa observed per kick sample (minimum = 18, maximum = 45). Hilsenhoff’s Biotic Index of tolerance to organic pollution, calculated for a community at a given kick sample, varied from 1.50 (excellent water quality) to 5.96 (fair water quality).

There was considerable variation among kick sample points in the substrate characteristics (Table 2). The dominant particle size was cobble (median 35%), with very little sand, silt, and clay present at any point (median total of sand, silt, and clay = 11%). Two principal components explained almost 75% of the variation in substrate descriptors. Substrate PC1 contrasted sampling points with gravel and some silt and sand present (positive values) to riffle areas dominated by cobble (negative values). Substrate PC2 contrasted riffle areas with gravel and pebble substrate (positive values) to those with boulders (negative values).

Table 2 Substrate particle size variability among 180 sampling points (10 streams × 2 reaches/stream × 3 riffles/reach × 3 points)

Total habitat assessment scores varied widely from 61 to 142 out of a maximum of 165 (Table 3). Substrate embeddedness, channel alteration, and frequency of riffles showed the greatest variability among reaches. Habitat PC1 contrasted reaches with little substrate cover, high embeddedness of substrate, lots of fine sediment deposition, and little riffle habitat (negative values) to those with plenty of cover, low embeddedness and deposition of fines, and plenty of riffle habitat (positive values). Habitat PC2 was a gradient of little variation in velocity/depth regimes within the reach and poorly developed riparian vegetation (negative values), to reaches with variable velocity/depth regimes and well developed, treed riparian vegetation (positive values). Habitat PC3 contrasted reaches with unstable banks and poor riparian vegetation (negative values) to those with stable banks and developed riparian vegetation (positive values).

Table 3 EPA Habitat assessment scores (derived from Plafkin et al., 1989) as estimated for each of the 10 streams × 2 reaches/stream = 20 reaches

Nested ANOVA showed over 75% of the variability in both taxonomic richness (S) and tolerance of the community to pollution (BI) was among streams (Table 4). Variation among kick samples within a riffle was a distant second place in variation of richness (17%), while variation between reaches of a stream was the second most important component for the biotic index of tolerance to pollution (14%). The multivariate gradients (known as canonical variates, CVs) best distinguishing streams (Fig. 1) were most related to tolerance (Stream CV1) and richness (Stream CV2). When the communities from each stream were plotted using the first two canonical variate scores, a clear gradient of stream communities from those with fewer and less tolerant taxa (e.g., North Thames River) to those with a richer, more tolerant community (e.g., Medway Creek) was observed (Fig. 2).

Table 4 Variance (and % of total) of nested effects for S (richness, number of invertebrate taxa), BI (biotic index, a measure of the average tolerance of community members to organic pollution)
Fig. 1 Variation and covariation of benthic invertebrate community descriptors (S, BI) and the canonical variate scores that best describe variation among streams in communities (StreamCV1, StreamCV2)

Fig. 2 Variation among streams in canonical variate scores that best describe variation among streams in communities (StreamCV1, StreamCV2)

As measured by the Bray-Curtis distance to the average community (minimum = 0 for a community exactly like the average community; maximum = 1 for a community very different from the average community), the composition of the communities varied mainly among streams (Table 5). There was as much variation in composition among kick sample points in a riffle as there was between reaches of a stream. Ordination based on the composition of the invertebrate community showed that in some cases (Avon River, Fish Creek, Phelan Creek) upstream and downstream reaches were well differentiated, while in other streams (Dingman Creek, Kintore Creek, Waubuno Creek) there was as much variation among riffles and kick sample points as there was between upstream and downstream reaches (Fig. 3). In two cases (Gregory Creek, Medway Creek), a particular riffle had communities distinct from the two other riffles at that reach and the three riffles at the other reach.

Table 5 Hierarchical variation in composition of macroinvertebrate communities as reflected by the Bray-Curtis distance (minimum = 0, maximum = 1) to the average community at a given scale
Fig. 3 Non-metric Multidimensional Scaling (NMDS) ordination plots of the 180 benthic invertebrate communities based on Bray-Curtis distances among communities calculated with the proportions of each taxon in the community. In each of the 10 plots, we plot the same points but label communities from the stream indicated. The label characters describe the Stream, Reach, Riffle, and Point (e.g., “AvDnUpRt” is from the Avon River, the downstream reach, the furthest upstream riffle, and the right hand point within this riffle)

Substrate varied mostly at the reach (Substrate PC1) and stream and reach (Substrate PC2) scales, with little variation observed among riffles at a reach or among points within a riffle (Table 4). Habitat characteristics varied much more (Habitat PC1 and PC3) or almost as much (Habitat PC2) between reaches within a stream as they did among streams (Table 4).

The total correlation of biota and substrate showed greater taxonomic richness and pollution tolerance in areas with finer substrate (i.e., riffles with more gravel and less cobble). Most of this correlation was at the larger spatial scales of stream and reach, rather than the smaller scales of riffles and kick samples within riffles (Table 6). Correlation between habitat assessment and biota was more prominent at the among-stream scale than at the between-reach scale (Table 7). Counterintuitively, there tended to be fewer, more pollution tolerant taxa in streams with a variety of velocity and depth regimes and treed riparian vegetation.

Table 6 Hierarchical covariation of benthic macroinvertebrate communities and the substrate particle size distribution
Table 7 Hierarchical covariation of benthic macroinvertebrate communities and the habitat assessment scores

Discussion

We have illustrated in our study of 10 streams in a highly agricultural area (median agriculture land cover ∼90%) that stream ecosystems sampled at the within riffle grain primarily vary at larger spatial scales, including the scale of the stream itself. This was true of the summary indices that described the diversity of the benthic invertebrate community (S, taxonomic richness) and its tolerance of organic pollution (BI, the biotic index), as well as the taxonomic composition of the community (Bray-Curtis distance of the community from the mean at a given scale). This is not to deny or ignore variation, sometimes statistically significant, at smaller spatial scales in these descriptors of the community. It simply argues that a credible assessment of a set of sites in a given area can be effectively carried out if one reach per stream is sampled. Important differences among streams will be detected by such a sampling design, particularly with respect to the sort of descriptors used for assessment. This is reassuring, since the status quo of environmental assessment is to sample only one site per stream, and relate the biota found at that site to that expected in reference condition (Bailey et al., 2004). This is at odds with Heino et al. (2004), who in a multi-scale, nested design study similar to ours found most of the variability in benthic invertebrate communities at smaller spatial scales and warned of inadequate replication in bioassessments. Unlike our study, Heino et al. (2004) were concerned with abundance measures, including abundance of functional feeding groups, rather than commonly used measures of the health of a community (diversity, tolerance). They may indeed be correct in suggesting caution if one wants to characterize the functional structure of the ecological community rather than its response to stressors in the environment.

A continuing, critical part of pure stream ecology, and an increasingly important aspect of applied, assessment stream ecology, is characterization of relationships between the stream environment and stream biota. This study, in the same manner as Bailey (1988), uses a description of the biota observed at a single grain size (the point within a riffle) to measure and characterize correlations with the environment at the same grain size (for substrate) and a larger grain size (the reach, for habitat assessment). These correlations were quantified at the various spatial scales of the study design. We found that correlations between the biota and its environment tended to be most prominent at larger scales (the stream or at least the reach). This again bodes well for the typical single reach per stream assessment study. It provides hope that important correlations between the environment of the stream and its biota will be detected even if just one riffle at one reach per stream is sampled. Johnson et al. (2004) looked exclusively at community composition in stony bottomed streams (and lakes) in a geographically and ecologically much larger study than ours. They found that the greatest strength of correlation between the community and its environment was at smaller scales of ecosystem (like our stream scale) and habitat descriptors rather than large scale, landscape and regional descriptors.

It is an ecological truism to assert that ecosystems are hierarchically structured. In stream ecology, particularly that aspect concerned with the environmental assessment of streams, we tend to deal with this in two ways:

  (i) Mention the importance of scale and hierarchies and then create studies that have a mixture of scales of data acquisition (e.g., the field sampling site), data analysis (e.g., the catchment area of the site, or perhaps the site), and interpretation and action as a result of the study (e.g., the stream as a whole)

  (ii) Make the overarching goal of the study to find the right scales or holons for stream ecology, firmly rooted in hierarchy theory (e.g., Parsons et al., 2004a, 2004b).

In this study, we have purposefully avoided an assertion as to the correct scales at which to sample a stream, or the holons that exist in the ecological hierarchy of stream ecosystems (cf. Parsons et al., 2004). We accept, of course, that the spatial and temporal scales at which we design our studies and analyse and interpret our data will affect the observations and conclusions we make. But rather than seeking, potentially in vain, scales that comprise the structure and function of larger scales and constrain the structure and function of smaller scales (what hierarchy theorists such as Parsons et al. (2004) would call holons), we use arbitrary, traditional observation scales (stream, reach, riffle, point) and let the data tell us how the ecosystem varies, and how the environment and biota covary. We see no utility in imposing a theoretical model, even if at least partially based on known functions of the ecosystem (e.g., geomorphological processes), on the hierarchical patterns of ecosystems observed in nature. In nature, holons form a continuum of two-way doors in a long and convoluted network of hallways. So we feel the most productive approach to understanding stream ecosystems is awareness of the doorway one is at or near when asking a certain question (e.g., “Is this stream OK?”, “What controls crayfish distribution among points at this site?”), and of the doorways at which one can collect data that will answer such questions most efficiently.