Abstract
The application of machine learning algorithms in ecology has surged over the past two decades. Increasingly, innovative uses of these sophisticated algorithms are driving inference and understanding in natural resource management. The concept behind machine learning is to provide data to a computer and allow the machine to ‘learn’ the patterns in those data. These learned relationships are then applied and analysed in a variety of ways, from clustering to prediction. In ecology and natural resource management, these methods are not well taught in classroom settings, which has led to a major disconnect between ecologists and the latest analytical techniques. In this chapter, we introduce machine learning with a focus on ecological and natural resource management applications. We provide definitions and a list of a few key algorithms that are becoming commonplace in the analysis of wildlife data. We further introduce a few broad concepts (i.e., data sharing, metadata and citizen science) in the sphere of ecological sciences. The ideas presented here will help us to better understand how to apply machine learning to conservation, management, and academic studies. They can also be used to teach the next generation of scientists how best to use machine learning algorithms in their own work. Our broader goal here, and elsewhere in this book, is to promote a holistic understanding of our planet through algorithms that can handle many interacting covariates and so represent how the world really works.
Keywords
- Machine learning
- Data mining
- Ecology
- Open access data
- Algorithms
Notes
1. This work was carried out with the Global Primate Network in Nepal, namely Ganga Ram Regmi, Madan Krishna Suwal, Dikpal Krishna Karmacharya, Kamal Kandel and Sonam Tashi Lama.
Acknowledgements
We would like to thank all the proponents of machine learning algorithms and their use in ecology. The list of people to thank for the discussions in this chapter is too extensive, but without them, our work would not be possible. Special thanks to V Morera and M Garcia Reyes for reviewing this chapter and providing insightful comments.
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Humphries, G.R.W., Huettmann, F. (2018). Machine Learning in Wildlife Biology: Algorithms, Data Issues and Availability, Workflows, Citizen Science, Code Sharing, Metadata and a Brief Historical Perspective. In: Humphries, G., Magness, D., Huettmann, F. (eds) Machine Learning for Ecology and Sustainable Natural Resource Management. Springer, Cham. https://doi.org/10.1007/978-3-319-96978-7_1
DOI: https://doi.org/10.1007/978-3-319-96978-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96976-3
Online ISBN: 978-3-319-96978-7
eBook Packages: Biomedical and Life Sciences (R0)