Abstract
We developed and evaluated empirical models to predict the biological condition of wadeable streams in a large portion of the eastern USA, with the ultimate goal of predicting condition in unsampled basins. Previous work had classified the biological condition of 920 streams as altered or unaltered based on a biological assessment of macroinvertebrate assemblages. Predictor variables were limited to widely available geospatial data, which included land cover, topography, climate, soils, societal infrastructure, and potential hydrologic modification. We compared the accuracy of predictions of biological condition class based on models with continuous and binary responses. We also evaluated the relative importance of specific groups of predictors and of individual predictor variables, as well as the relationships between the most important predictors and biological condition. Prediction accuracy and the relative importance of predictor variables differed between the two subregions for which models were created. Predictive accuracy in the highlands region improved when predictors representing both natural factors and human activities were included. Riparian land cover and road-stream intersections were the most important predictors. In contrast, predictive accuracy in the lowlands region was best for models limited to predictors representing natural factors, including basin topography and soil properties. Partial dependence plots revealed complex and nonlinear relationships between specific predictors and the probability of biological alteration. We demonstrate a potential application of the model by predicting biological condition in 552 unsampled basins across an ecoregion in southeastern Wisconsin (USA). Estimates of the likely biological condition of unsampled streams could be a valuable tool for screening large numbers of basins to focus targeted monitoring of potentially unaltered or altered stream segments.
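The partial dependence idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model or code: `toy_predict`, the predictor names `road_crossings` and `riparian_forest`, and the three example basins are hypothetical stand-ins chosen only to show the calculation. For each value on a grid, the predictor of interest is forced to that value in every basin, and the predicted probabilities of alteration are averaged.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Return the mean prediction over all rows for each grid value of `feature`.

    `predict` is any function mapping one row (a dict of predictors) to a
    predicted probability. The input rows are copied, not modified.
    """
    curve = []
    for value in grid:
        # Force the chosen predictor to `value` in every row, keep the rest.
        modified = [{**row, feature: value} for row in rows]
        curve.append(mean(predict(r) for r in modified))
    return curve


def toy_predict(row):
    """Hypothetical stand-in for a fitted model (an assumption, not from the paper).

    The predicted probability of alteration rises with road-stream crossings
    and falls with the fraction of riparian forest, clipped to [0, 1].
    """
    score = 0.2 + 0.1 * row["road_crossings"] - 0.3 * row["riparian_forest"]
    return min(1.0, max(0.0, score))


# Made-up basins, for illustration only.
basins = [
    {"road_crossings": 0, "riparian_forest": 0.9},
    {"road_crossings": 3, "riparian_forest": 0.5},
    {"road_crossings": 6, "riparian_forest": 0.1},
]

grid = [0, 2, 4, 8]
curve = partial_dependence(toy_predict, basins, "road_crossings", grid)
for value, prob in zip(grid, curve):
    print(f"road_crossings={value}: mean predicted probability {prob:.3f}")
```

Plotting `curve` against `grid` gives a partial dependence plot for that predictor. With a real fitted model in place of `toy_predict`, the curve need not be monotonic, which is how nonlinear relationships of the kind described above would become visible.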
Cite this article
Carlisle, D.M., Falcone, J. & Meador, M.R. Predicting the biological condition of streams: use of geospatial indicators of natural and anthropogenic characteristics of watersheds. Environ Monit Assess 151, 143–160 (2009). https://doi.org/10.1007/s10661-008-0256-z
DOI: https://doi.org/10.1007/s10661-008-0256-z

