Random Forests Applied as a Soil Spatial Predictive Model in Arid Utah

Part of the Progress in Soil Science book series (PROSOIL, volume 2)


We applied random forests (RF), a decision-tree-based analysis, to predict 24 soil classes across an arid watershed of western Utah. Environmental covariates were derived from Landsat 7 Enhanced Thematic Mapper Plus (ETM+) imagery and digital elevation models (DEM). Random forests are similar to classification and regression trees (CART). However, RF is doubly random. Many (e.g., 500) weak trees are grown (trained) independently: each tree is trained on a new randomly selected bootstrap sample, and a random subset of variables is used to split each node. To train and validate the RF trees, 561 soil descriptions were made in the field. An additional 111 points were added by case-based reasoning using photo interpretation. Because RF makes classification decisions from the mode of many independently grown trees, model uncertainty can be derived. Furthermore, the probability that a pixel belongs to one or more classes in the legend can be determined. The overall out-of-bag (OOB) error for discrete classes was 55.2%. The confusion matrix revealed that four soils that commonly co-occurred on landforms were frequently misclassified as one another. These soils were combined into six soil map units. To identify pixels that might belong to one of these newly created combinations of soil classes, minimum threshold probabilities were set. Employing probability by class can be an effective and objective method of determining membership in soil map unit associations and complexes mapped at the 1:24,000 scale.
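The voting and thresholding ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' workflow or software. The class names, the map unit name, the tree votes, and the 0.3 threshold are all hypothetical, and the rule "every member class must reach the minimum probability" is one plausible reading of the threshold step, not a rule stated in the abstract.

```python
from collections import Counter


def class_probabilities(tree_votes):
    """Fraction of trees voting for each class at a single pixel."""
    counts = Counter(tree_votes)
    n_trees = len(tree_votes)
    return {cls: count / n_trees for cls, count in counts.items()}


def assign_map_unit(probs, map_units, min_prob):
    """Return the first map unit whose member classes all reach min_prob.

    Falls back to the single most probable class (the mode of the votes)
    when no combined unit qualifies. This rule is an assumption made for
    illustration only.
    """
    for unit_name, members in map_units.items():
        if all(probs.get(member, 0.0) >= min_prob for member in members):
            return unit_name
    return max(probs, key=probs.get)


# Hypothetical combined map unit built from two often-confused soil classes.
units = {"A-B complex": ("A", "B")}

# Pixel 1: ten hypothetical trees split their votes between A and B.
mixed_votes = ["A"] * 4 + ["B"] * 4 + ["C"] * 2
mixed_probs = class_probabilities(mixed_votes)
mixed_label = assign_map_unit(mixed_probs, units, min_prob=0.3)

# Pixel 2: the trees mostly agree on class A.
clear_votes = ["A"] * 8 + ["B"] * 2
clear_probs = class_probabilities(clear_votes)
clear_label = assign_map_unit(clear_probs, units, min_prob=0.3)

print(mixed_label, mixed_probs)
print(clear_label, clear_probs)
```

In this sketch the first pixel is assigned to the combined unit because both member classes reach the threshold, while the second pixel keeps its single most probable class. The spread of votes across classes also serves as a simple per-pixel indicator of model uncertainty.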


Soil components; Soil map units; Digital soil mapping; Digital elevation model; Satellite imagery





Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  1. USDA Natural Resources Conservation Service, Richfield, USA
  2. Department of Plants, Soils, and Climate, Utah State University, Logan, USA
  3. Department of Watershed Science, Utah State University, Logan, USA
  4. Department of Wildland Resources, Utah State University, Logan, USA
