Landscape Ecology, Volume 24, Issue 5, pp 673–683

Gradient modeling of conifer species using random forests

Research Article


Landscape ecology often adopts a patch mosaic model of ecological patterns. However, many ecological attributes are inherently continuous, and classifying species composition into vegetation communities and discrete patches provides an overly simplistic view of the landscape. If one adopts a niche-based, individualistic concept of biotic communities, then it may often be more appropriate to represent vegetation patterns as continuous measures of site suitability or probability of occupancy, rather than through the traditional abstraction into categorical community types represented in a mosaic of discrete patches. The goal of this paper is to demonstrate the high effectiveness of species-level, pixel-scale prediction of species occupancy as a continuous landscape variable, as an alternative to traditional classified community-type vegetation maps. We use a Random Forests ensemble learning approach to predict site-level probability of occurrence for four conifer species based on climatic, topographic, and spectral predictor variables across a 3,883 km² landscape in northern Idaho, USA. Our method uses a new permutated sample-downscaling approach to equalize sample sizes in the presence and absence classes, a model selection method to optimize parsimony, and independent validation using prediction to a 10% bootstrap data withhold. The models exhibited very high accuracy, with AUC and kappa values over 0.86 and 0.95, respectively, for all four species. The spatial predictions produced by the models will be of great use to managers and scientists, as they provide a vastly more accurate spatial depiction of vegetation structure across this landscape than has previously been provided by traditional categorical classified community-type maps.
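As a rough illustration of the workflow the abstract outlines, the sketch below shows one way to equalize presence and absence sample sizes, withhold 10% of the data, and compute kappa and AUC on withheld predictions. It is not the authors' implementation: the function names (`balanced_downsample`, `holdout_split`, `cohen_kappa`, `auc`) are hypothetical, the down-sampling is a single random draw rather than the permutated procedure the paper describes, and the Random Forests model itself is omitted (any classifier that returns a probability of occurrence could supply the scores).

```python
import random
from typing import List, Sequence, Tuple


def balanced_downsample(labels: Sequence[int], rng: random.Random) -> List[int]:
    """Return row indices holding equal numbers of presences (1) and absences (0).

    Assumption: the larger class is randomly subsampled, once, to the size of
    the smaller class. A permuted variant would repeat this draw many times.
    """
    presence = [i for i, y in enumerate(labels) if y == 1]
    absence = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(presence), len(absence))
    return sorted(rng.sample(presence, n) + rng.sample(absence, n))


def holdout_split(n_rows: int, fraction: float,
                  rng: random.Random) -> Tuple[List[int], List[int]]:
    """Withhold a random fraction of rows; return (train_indices, test_indices)."""
    idx = list(range(n_rows))
    rng.shuffle(idx)
    n_test = max(1, round(n_rows * fraction))
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def cohen_kappa(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Cohen's kappa for binary labels: agreement corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_true = sum(y_true) / n
    p_pred = sum(y_pred) / n
    expected = p_true * p_pred + (1 - p_true) * (1 - p_pred)
    if expected == 1.0:
        # Degenerate case (only one class present); kappa is undefined,
        # so this sketch returns 0.0 by convention.
        return 0.0
    return (observed - expected) / (1 - expected)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a presence outscores an absence (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In use, one would split the plots with `holdout_split(n, 0.10, rng)`, balance the training labels with `balanced_downsample`, fit a model on the balanced rows, and pass its predictions for the withheld rows to `cohen_kappa` and `auc`.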


Keywords: Predictive modeling, Random forests, CART, Gradient



Copyright information

© US Government 2009

Authors and Affiliations

  1. The Nature Conservancy – Rocky Mountain Conservation Region, Fort Collins, USA
  2. Department of Agriculture, Rocky Mountain Research Station, US Forest Service, Missoula, USA
