Using Random Forests to Provide Predicted Species Distribution Maps as a Metric for Ecological Inventory & Monitoring Programs
Sustainable management efforts are currently hindered by a lack of basic information about the spatial distribution of species on large landscapes. Computationally advanced species distribution models built on complex ecological databases can make substantial progress toward solving this problem. However, limited knowledge of the ecological relationships that drive species distributions reduces the capacity of classical statistical approaches to produce accurate predictive maps. Advances in machine learning, such as classification and bagging algorithms, provide a powerful tool for quickly building accurate predictive models of species distributions even when little ecological knowledge is readily available. Such approaches are also well known for their robustness when dealing with large, low-quality data sets. Here, we used Random Forests (Salford Systems Ltd. and R language), a highly accurate bagging classification algorithm originally developed by L. Breiman and A. Cutler, to build multi-species avian distribution models using data collected as part of the Kenai National Wildlife Refuge Long-term Ecological Monitoring Program (LTEMP). Distribution maps are a useful monitoring metric because they can be used to document range expansions or contractions and can also be linked to population estimates. We used variable-radius point count data collected in 2004 and 2006 at 255 points arranged in a systematic grid with 4.8 km resolution, spanning the 7722 km² spatial extent of Alaska’s Kenai National Wildlife Refuge. We built distribution models for 40 bird species that were present within 200 m of 2–56% of the sampling points, so the models represent species that are both rare and common on the landscape.
All models were built using a common set of 157 environmental predictor variables representing topographical features, climatic space, vegetation, anthropogenic variables, spatial structure, and 5 randomly generated neutral landscape variables for quality assessment. Models with that many predictors have not been used before in avian modeling, but are commonly used in similar types of applications in commercial disciplines. Random Forests produced strong models (ROC > 0.8) for 16 bird species, marginal models (0.7 < ROC < 0.8) for 13 species, and weak models (ROC < 0.7) for 11 species. The ability of Random Forests to provide accurate predictive models was independent of how common or rare a bird was on the landscape. Random Forests did not rank any of the 5 neutral landscape variables as important for any of the 40 bird species. We argue that for inventory and monitoring programs the interpretive focus and confidence in reliability should be placed in the predictive ability of the map, and not in the assumed ecological meaning of the predictors or their linear relationships to the response variable. Given this focus, machine learning algorithms provide a very powerful, cost-saving approach for building reliable predictions of species occurrence on the landscape, given the current lack of knowledge about the ecological drivers for many species. Land management agencies need reliable predictions of current species distributions in order to detect and understand how climate change and other landscape drivers will affect future biodiversity.
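The ROC-based grouping of models described above can be illustrated with a minimal Python sketch. It is not the authors' workflow (the abstract names Salford Systems software and R); it only shows how an area under the ROC curve can be computed from presence/absence observations and predicted scores, and how the stated thresholds map to the three classes. The function names `roc_auc` and `model_strength` and the example data are hypothetical, and assigning a value of exactly 0.7 or 0.8 to the marginal class is an assumption, since the abstract does not specify the boundaries.

```python
from typing import Sequence


def roc_auc(observed: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    observed: 1 for presence, 0 for absence at each sampling point.
    scores:   predicted probability of occurrence at each point.
    """
    presences = [s for o, s in zip(observed, scores) if o == 1]
    absences = [s for o, s in zip(observed, scores) if o == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence")

    # Count presence/absence pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))


def model_strength(auc: float) -> str:
    """Map an ROC value to the three classes used in the abstract.

    Assumption: values of exactly 0.7 or 0.8 are treated as marginal.
    """
    if auc > 0.8:
        return "strong"
    if auc >= 0.7:
        return "marginal"
    return "weak"


# Hypothetical data for one species at six sampling points.
observed = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
auc = roc_auc(observed, scores)
label = model_strength(auc)
```

Here `auc` is the fraction of presence/absence pairs in which the presence point received the higher predicted score, which is one standard reading of the ROC value as a measure of a map's predictive ability.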