Gradient modeling of conifer species using random forests

  • Research Article
  • Landscape Ecology

Abstract

Landscape ecology often adopts a patch mosaic model of ecological patterns. However, many ecological attributes are inherently continuous, and classification of species composition into vegetation communities and discrete patches provides an overly simplistic view of the landscape. If one adopts a niche-based, individualistic concept of biotic communities, then it may often be more appropriate to represent vegetation patterns as continuous measures of site suitability or probability of occupancy, rather than the traditional abstraction into categorical community types represented in a mosaic of discrete patches. The goal of this paper is to demonstrate the high effectiveness of species-level, pixel-scale prediction of species occupancy as a continuous landscape variable, as an alternative to traditional classified community type vegetation maps. We use a Random Forests ensemble learning approach to predict site-level probability of occurrence for four conifer species based on climatic, topographic and spectral predictor variables across a 3,883 km² landscape in northern Idaho, USA. Our method uses a new permutated sample-downscaling approach to equalize sample sizes in the presence and absence classes, a model selection method to optimize parsimony, and independent validation using prediction to a 10% bootstrap data withhold. The models exhibited very high accuracy, with AUC and kappa values over 0.86 and 0.95, respectively, for all four species. The spatial predictions produced by the models will be of great use to managers and scientists, as they provide a vastly more accurate spatial depiction of vegetation structure across this landscape than has previously been provided by traditional categorical classified community type maps.
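The abstract names three ingredients that can be illustrated in a few lines: equalizing presence and absence sample sizes, and scoring predictions with AUC and kappa. The following is a minimal Python sketch of those ideas only, not the authors' implementation. The function names, the single random down-sampling step (standing in for the permutated sample-downscaling approach, whose details are not given here), the 0.5 threshold, and the toy data are all assumptions made for illustration.

```python
import random


def downsample_balanced(labels, rng):
    """Return sorted indices in which the larger class is randomly
    reduced to the size of the smaller class (hypothetical helper)."""
    presence = [i for i, y in enumerate(labels) if y == 1]
    absence = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(presence), len(absence))
    return sorted(rng.sample(presence, n) + rng.sample(absence, n))


def auc(labels, scores):
    """Area under the ROC curve as the fraction of presence/absence
    pairs ranked correctly; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(labels, predicted):
    """Cohen's kappa for binary labels:
    (observed agreement - chance agreement) / (1 - chance agreement).
    Undefined (division by zero) when chance agreement equals 1."""
    n = len(labels)
    observed = sum(y == p for y, p in zip(labels, predicted)) / n
    p_true = sum(labels) / n
    p_pred = sum(predicted) / n
    chance = p_true * p_pred + (1 - p_true) * (1 - p_pred)
    return (observed - chance) / (1 - chance)


# Toy data (invented): 2 presences, 6 absences, with made-up
# predicted probabilities of occurrence.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.2, 0.05]

rng = random.Random(0)  # fixed seed so the draw is repeatable
balanced = downsample_balanced(labels, rng)

# Threshold of 0.5 is an assumption for converting scores to classes.
predicted = [1 if s >= 0.5 else 0 for s in scores]

print("balanced indices:", balanced)
print("AUC:", auc(labels, scores))
print("kappa:", cohen_kappa(labels, predicted))
```

In a fuller workflow the balanced indices would select the training rows for a random forest, and the down-sampling would be repeated many times so that every absence record has a chance of being used. The pairwise AUC shown here is quadratic in sample size, which is fine for a sketch but slow for large plot datasets.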

Acknowledgments

We would like to thank M. Murphy, W. Godsoe, J. Rehfeldt, R. Dezzani, N. Crookston, and J. Kiesecker for helpful discussion of methods and concepts presented in this paper.

Author information

Corresponding author

Correspondence to Samuel A. Cushman.

About this article

Cite this article

Evans, J.S., Cushman, S.A. Gradient modeling of conifer species using random forests. Landscape Ecol 24, 673–683 (2009). https://doi.org/10.1007/s10980-009-9341-0

  • DOI: https://doi.org/10.1007/s10980-009-9341-0
