Machine-learning models, cost matrices, and conservation-based reduction of selected landscape classification errors

  • Perspective Article
  • Landscape Ecology

Abstract

Context

Use of statistical models developed with machine-learning algorithms is increasing in the ecological sciences, yet these disciplines have not capitalized on the ability to use cost matrices to selectively reduce classification errors that have highly detrimental consequences.

Objectives

Our aim was to promote such applications by demonstrating the process of using a cost matrix to decrease specific types of misclassification, explaining the importance of exploring the effectiveness of cost matrices for a given dataset, and encouraging use of cost matrices with machine-learning models in landscape-ecological and conservation contexts.
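For the binary case, the way a cost matrix shifts predictions can be written as a short decision rule. This is a generic decision-theoretic sketch, not the article's derivation. It assumes a classifier that outputs a probability $p$ that a site belongs to the positive class, that correct predictions cost nothing, and it uses the illustrative symbols $c_{FN}$ and $c_{FP}$ (not from the article) for the costs of a false negative and a false positive.

```latex
\begin{align}
\mathbb{E}[\text{cost} \mid \text{predict negative}] &= p \, c_{FN}, \\
\mathbb{E}[\text{cost} \mid \text{predict positive}] &= (1 - p) \, c_{FP}, \\
\text{predict positive} \iff (1 - p) \, c_{FP} < p \, c_{FN}
  &\iff p > \frac{c_{FP}}{c_{FP} + c_{FN}}.
\end{align}
```

With equal costs the threshold is $0.5$. With $c_{FN} = 4\,c_{FP}$ it drops to $0.2$, so more cases are labeled positive and fewer false negatives occur, at the price of more false positives. Tree algorithms may instead incorporate costs while the model is built, so this rule is only a way to reason about the effect.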

Methods

Bird occurrence data, landscape and regional land-cover data, costs of false-positive and false-negative errors, and the C5.0 decision tree algorithm were used to train and test a binary classifier.
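As a rough illustration of how false-positive and false-negative costs enter a binary classification, the sketch below picks the class with the lowest expected cost for a given predicted probability. It is not the C5.0 algorithm and not the authors' R code: the class labels, the 4:1 cost ratio, and the function names `expected_cost` and `min_cost_label` are all hypothetical, and the probability is assumed to come from some already-fitted model.

```python
from typing import Dict, Tuple

CostMatrix = Dict[Tuple[str, str], float]

# Hypothetical costs keyed by (actual class, predicted class).
# The 4:1 ratio is illustrative only, not a value taken from the article.
COSTS: CostMatrix = {
    ("present", "present"): 0.0,
    ("present", "absent"): 4.0,   # false negative: species missed
    ("absent", "present"): 1.0,   # false positive: species wrongly predicted
    ("absent", "absent"): 0.0,
}

# Baseline in which both error types are equally costly.
EQUAL_COSTS: CostMatrix = {
    ("present", "present"): 0.0,
    ("present", "absent"): 1.0,
    ("absent", "present"): 1.0,
    ("absent", "absent"): 0.0,
}


def expected_cost(p_present: float, predicted: str, costs: CostMatrix) -> float:
    """Expected cost of predicting `predicted` when P(present) = p_present."""
    return (p_present * costs[("present", predicted)]
            + (1.0 - p_present) * costs[("absent", predicted)])


def min_cost_label(p_present: float, costs: CostMatrix) -> str:
    """Return the label with the lowest expected cost."""
    return min(("present", "absent"),
               key=lambda label: expected_cost(p_present, label, costs))


if __name__ == "__main__":
    # Compare labels under equal costs and under the heavier false-negative cost.
    for p in (0.1, 0.3, 0.6):
        print(p, min_cost_label(p, EQUAL_COSTS), min_cost_label(p, COSTS))
```

A site with a predicted probability of 0.3 is labeled absent under equal costs but present once a false negative is assumed to cost four times as much as a false positive.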

Results

Increasing the cost for false negatives tended to decrease the frequency of this error type while allowing for reasonable predictive performance for each class separately and both classes combined.
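The trade-off described here can be explored with a small sweep over false-negative costs. The sketch below uses a tiny synthetic dataset (made-up labels and probabilities, not the article's bird data) and a minimum-expected-cost threshold rather than C5.0. The names `confusion_counts` and `summarize` are hypothetical helpers. For each cost it reports the error counts together with sensitivity, specificity, and overall accuracy, which correspond to performance for each class separately and for both classes combined.

```python
from typing import Dict, List

# Synthetic example: 1 = present, 0 = absent. Values are invented for illustration.
ACTUAL: List[int] = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
P_PRESENT: List[float] = [0.9, 0.7, 0.45, 0.3, 0.15, 0.6, 0.4, 0.25, 0.1, 0.05]


def confusion_counts(actual: List[int], p_present: List[float],
                     fn_cost: float, fp_cost: float = 1.0) -> Dict[str, int]:
    """Classify with the minimum-expected-cost threshold and count outcomes."""
    threshold = fp_cost / (fp_cost + fn_cost)
    counts = {"tp": 0, "fn": 0, "fp": 0, "tn": 0}
    for y, p in zip(actual, p_present):
        predicted = 1 if p > threshold else 0
        if y == 1:
            counts["tp" if predicted == 1 else "fn"] += 1
        else:
            counts["fp" if predicted == 1 else "tn"] += 1
    return counts


def summarize(counts: Dict[str, int]) -> Dict[str, float]:
    """Per-class and overall performance from confusion counts."""
    total = sum(counts.values())
    return {
        "sensitivity": counts["tp"] / (counts["tp"] + counts["fn"]),
        "specificity": counts["tn"] / (counts["tn"] + counts["fp"]),
        "accuracy": (counts["tp"] + counts["tn"]) / total,
    }


if __name__ == "__main__":
    # Raise the false-negative cost while holding the false-positive cost at 1.
    for fn_cost in (1.0, 2.0, 4.0):
        counts = confusion_counts(ACTUAL, P_PRESENT, fn_cost)
        print(fn_cost, counts, summarize(counts))
```

On this toy data, false negatives fall as their cost rises while false positives increase, which is why it helps to check per-class and overall performance for each candidate cost before settling on one.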

Conclusions

Cost matrices are applicable to many different categorical response variables and spatial scales. We encourage landscape ecologists and planners to explore the effectiveness of cost matrices for their particular dataset and project goals, especially when conservation of biodiversity across broad spatial extents is at stake.

Data availability

The metadata, data, and R code used in this research are included in this article’s electronic supplementary material files.

Acknowledgements

We thank J. Stoklosa for comments about an earlier version of the manuscript and Baylor University for supporting this research.

Funding

The authors’ work on this project was supported by funding from Baylor University.

Author information

Corresponding author

Correspondence to Kevin J. Gutzwiller.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

The authors declare that they are in full compliance with all of the ethical standards for publishing in Landscape Ecology. Data obtained from the website for the North American Breeding Bird Survey involved birds, but the authors’ research did not involve actual interaction with birds, other animals, or human subjects.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gutzwiller, K.J., Chaudhary, A. Machine-learning models, cost matrices, and conservation-based reduction of selected landscape classification errors. Landscape Ecol 35, 249–255 (2020). https://doi.org/10.1007/s10980-020-00969-y

  • DOI: https://doi.org/10.1007/s10980-020-00969-y
