
‘Batteries’ in Machine Learning: A First Experimental Assessment of Inference for Siberian Crane Breeding Grounds in the Russian High Arctic Based on ‘Shaving’ 74 Predictors


The Siberian crane (Leucogeranus leucogeranus) remains an elusive but highly regarded species of global conservation concern. It breeds in the Russian high Arctic, and two subpopulations are known. Here we present, for the first time, a machine learning-based summer habitat analysis that uses nesting data for the eastern population on its breeding grounds and predictive modeling with 74 GIS predictors. Parsimony is typically desired because it helps make models more interpretable, but findings generally show that it does not yield the greatest improvement to the model or to inference. ‘Batteries’ are a new concept in machine learning: sets of experiments that help to test predictors and to select models. Here we show 28 of those ‘batteries’ and compare multiple approaches to model runs, ranging from iteratively dropping the least or most important predictor (‘variable shaving’) to allowing all predictors to contribute. We found that the generic ‘kitchen sink’ model with TreeNet (an optimized boosting algorithm from Salford Systems Ltd) performs best. Although ‘batteries’ remain widely underused in wildlife conservation management, ‘shaving’ was of great use for learning about the structure, role and impacts of predictors and their spatial performance, which supports non-parsimonious work. Of great interest is the finding that a bundle of low-ranked predictors performs almost as well as, or better than, the so-called top predictors. This is called ‘predictor swapping’. This is the best and most detailed habitat study and prediction for the Siberian crane in summer thus far. It is intended for use in conservation management and as a generic template for any species at a time when data availability and the environmental crisis are both on the rise, specifically in the high Arctic.
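The ‘variable shaving’ idea described above can be sketched roughly as follows. This is a minimal, standard-library Python illustration and not the TreeNet/SPM battery used in the chapter: the data are synthetic, the ‘model’ is a simple correlation-weighted score standing in for a boosted-tree model, importance is taken as the absolute correlation with the response, and all names (`make_data`, `weights`, `shave`, `run_demo`) are hypothetical.

```python
import random


def make_data(n=400, n_noise=4, seed=0):
    """Synthetic presence/absence data (assumption): predictors 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate absence (0) and presence (1)
        row = [2 * label + rng.gauss(0, 0.5), label + rng.gauss(0, 1.0)]
        row += [rng.gauss(0, 1.0) for _ in range(n_noise)]  # pure-noise predictors
        X.append(row)
        y.append(label)
    return X, y


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5


def weights(X, y, cols):
    """Stand-in 'model': one signed weight per predictor (its correlation with y)."""
    return {c: pearson([row[c] for row in X], y) for c in cols}


def auc(scores, y):
    """Probability that a random presence outscores a random absence (ties count half)."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def shave(X_train, y_train, X_test, y_test, drop="least"):
    """Refit after removing one predictor per round; return [(predictors, test AUC), ...]."""
    cols = list(range(len(X_train[0])))
    history = []
    while cols:
        w = weights(X_train, y_train, cols)
        scores = [sum(w[c] * row[c] for c in cols) for row in X_test]
        history.append((tuple(cols), auc(scores, y_test)))
        pick = min if drop == "least" else max  # which end of the ranking to shave
        cols.remove(pick(cols, key=lambda c: abs(w[c])))
    return history


def run_demo(drop="least"):
    """Shave the synthetic data using a 300/100 train/test split."""
    X, y = make_data()
    return shave(X[:300], y[:300], X[300:], y[300:], drop=drop)


if __name__ == "__main__":
    for predictors, score in run_demo("least"):
        print(len(predictors), predictors, round(score, 3))
```

Calling `run_demo("most")` instead removes the top-ranked predictor first. Comparing the two histories gives a rough feel for how much of the performance rests on the top predictors and how much on the remainder, which is the kind of question the ‘shaving’ batteries address.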


  • Eastern population Siberian crane (Leucogeranus leucogeranus)
  • Nesting areas
  • Russian high arctic
  • Machine learning
  • Batteries (‘Shaving’)
  • Predictor swapping

  • DOI: 10.1007/978-3-319-96978-7_8
  • Chapter length: 22 pages
  • ISBN: 978-3-319-96978-7
Figs. 8.1–8.5


  • Akaike H (1974) A new look at the statistical model identification. IEEE Trans Automat Contr AC-19:716–723
  • Arnold TW (2010) Uninformative parameters and model selection using Akaike’s information criterion. J Wildl Manag 74:1175–1178
  • Barbet-Massin M, Jiguet F, Albert CH, Thuiller W (2012) Selecting pseudo-absences for species distribution models: how, where and how many? Methods Ecol Evol 3:327–338
  • BirdLife International (2001) Threatened birds of Asia: the BirdLife International red data book, vol 1. BirdLife International, Cambridge
  • Breiman L (2001) Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat Sci 16:199–231
  • Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. CRC Press, Boca Raton
  • Burnham KP, Anderson DR (2002) Model selection and multimodel inference: a practical information-theoretic approach. Springer, New York
  • Cai T, Huettmann F, Guo Y (2014) Using stochastic gradient boosting to infer stopover habitat selection and distribution of hooded cranes Grus monacha during spring migration in Lindian, Northeast China. PLoS ONE 9
  • Chamberlin TC (1890) The method of multiple working hypotheses. Science 15:92–96
  • Elith J, Graham CH, Anderson RP, Dudík M, Ferrier S, Guisan A, Hijmans RJ, Huettmann F, Leathwick JR, Lehmann A, Li J, Lohmann LG, Loiselle BA, Manion G, Moritz C, Nakamura M, Nakazawa Y, Overton JMcC, Peterson AT, Phillips SJ, Richardson K, Scachetti-Pereira R, Schapire RE, Soberón J, Williams S, Wisz MS, Zimmermann NE (2006) Novel methods improve prediction of species’ distributions from occurrence data. Ecography 29:129–151
  • Fielding A (1999) Machine learning methods for ecological applications. Springer, Boston
  • Fielding A, Bell JF (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ Conserv 24:38–49
  • Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29:1189–1232
  • Friedman JH (2002) Stochastic gradient boosting. Comp Stat Data Anal 38:367–378
  • Guthery FS, Brennan LA, Peterson MJ, Lusk LL (2005) Information theory in wildlife science: critique and viewpoint. J Wildl Manag 69:457–465
  • Han X, Guo Y, Mi C, Huettmann F, Wen L (2017) Machine learning model analysis of breeding habitats for the Black-necked Crane in Central Asian Uplands under anthropogenic pressures. Sci Rep 7:6114
  • Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction, 2nd edn. Springer, New York
  • Herrick KA, Huettmann F, Lindgren MA (2013) A global model of avian influenza prediction in wild birds: the importance of northern regions. Vet Res
  • Hilborn R, Mangel M (1997) The ecological detective: confronting models with data. Princeton University Press, Princeton
  • Hochachka W, Caruana R, Fink D, Munson A, Riedewald M, Sorokina D, Kelling S (2007) Data mining for discovery of pattern and process in ecological systems. J Wildl Manag 71:2427–2437
  • Jiao S, Guo Y, Huettmann F, Lei G (2014) Nest-site selection analysis of hooded crane (Grus monacha) in northeastern China based on a multivariate ensemble model. Zool Sci 31:430–437
  • Kandel K, Huettmann F, Suwal MK, Regmi GR, Nijman V, Nekaris KAI, Lama ST, Thapa A, Sharma HP, Subedi TR (2015) Rapid multi-nation distribution assessment of a charismatic conservation species using open access ensemble model GIS predictions: red panda (Ailurus fulgens) in the Hindu-Kush Himalaya region. Biol Conserv 181:150–161
  • Kanai Y, Ueta M, Germogenov N, Nagendran M, Mita N, Higuchi H (2002) Migration routes and important resting areas of Siberian cranes (Grus leucogeranus) between northeastern Siberia and China as revealed by satellite tracking. Biol Conserv 106:339–346
  • Klein DR, Magomedova M (2003) Industrial development and wildlife in arctic ecosystems: can learning from the past lead to a brighter future? In: Rasmussen RO, Koroleva NE (eds) Social and environmental impacts in the North. Kluwer Academic Publishers, The Netherlands, pp 35–56
  • Mace G, Cramer W, Diaz S, Faith DP, Larigauderie A, Le Prestre P, Palmer M, Perrings C, Scholes RJ, Walpole M, Walter BA, Watson JEM, Mooney HA (2010) Biodiversity targets after 2010. Env Sustain 2:3–8
  • Manly BFJ, McDonald LL, Thomas DL, McDonald TL, Erickson WP (2002) Resource selection by animals: statistical design and analysis for field studies, 2nd edn. Kluwer Academic Publishers, The Netherlands
  • Matthiessen P (2001) The birds of heaven: travels with cranes. North Point Press, New York
  • McGarigal K, Cushman S, Stafford S (2000) Multivariate statistics for wildlife and ecology research. Springer, New York
  • Mi C, Huettmann F, Guo Y, Han X, Wen L (2017) Why choose random forest to predict rare species distribution with few samples in large undersampled areas? Three Asian crane species models provide supporting evidence. PeerJ
  • Moore GS, Ilyashenko E (2009) Regional flyway education programs: increasing public awareness of crane conservation along the crane flyways of Eurasia and North America. In: Prentice C (ed) Conservation of flyway wetlands in East and West/Central Asia. Proceedings of the project completion workshop of the UNEP/GEF Siberian Crane wetland project, 14–15 October 2009, Harbin, China. International Crane Foundation, Baraboo (Wisconsin), USA
  • Mueller JP, Massaron L (2016) Machine learning for dummies. For Dummies Publisher, 435 p
  • Ohse B, Huettmann F, Ickert-Bond S, Juday G (2009) Modeling the distribution of white spruce (Picea glauca) for Alaska with high accuracy: an open access role-model for predicting tree species in last remaining wilderness areas. Polar Biol 32:1717–1724
  • Prentice C (ed) (2010) Conservation of flyway wetlands in East and West/Central Asia. Proceedings of the project completion workshop of the UNEP/GEF Siberian Crane wetland project, 14–15 October 2009, Harbin, China. International Crane Foundation, Baraboo (Wisconsin), USA
  • Sorokin AG, Kotyukov YV (1987) Discovery of the nesting ground of the Ob River population of the Siberian Crane. In: Archibald GW, Pasquier RF (eds) Proceedings of the 1983 international crane workshop. International Crane Foundation, Baraboo, pp 209–212
  • Sorokin A, Markin Y (1996) New nesting site of Siberian Cranes. Newsletter of Russian Bird Conservation Union, Moscow
  • Spiridonov V, Gavrilo M, Krasnov MA, Nikolaeva N, Sergienko L, Popov A, Krasnova E (2011) Toward the new role of marine and coastal protected areas in the Arctic: the Russian case. In: Huettmann F (ed) Protection of the three poles. Springer, New York
  • Silvy NJ (2012) The wildlife techniques manual: research and management, vol 2, 7th edn. Johns Hopkins University Press, Baltimore
  • Van Impe J (2013) Esquisse de l’avifaune de la Sibérie Occidentale: une revue bibliographique. Alauda 81:269–296
  • Wu G, Leeuw J, Skidmore AK, Prins HHT, Best EPH, Liu Y (2009) Will the Three Gorges Dam affect the underwater light climate of Vallisneria spiralis L. and food habitat of Siberian Crane in Poyang Lake? Hydrobiologia 623:213–222
  • Yu C, Yinghao W, Qing Y (2008) Ground survey of waterbirds in the Poyang Lake region in Winter 2007/2008. Siberian Crane Flyway News: 15


Acknowledgements

We thank Dan Steinberg and Salford Systems Ltd. for a workshop with U.S. IALE at Snowbird, Utah, which introduced us to the power of batteries. FH acknowledges the long and kind collaboration with the Forestry University of Beijing, China, and the use of their data. We also thank U.S. IALE, S. Linke, C. Cambu, H. Hera, H. Berrios Alvarez and the EWHALE lab at UAF for their support. This is EWHALE lab publication #185.

Author information


Corresponding author

Correspondence to Falk Huettmann.


Appendix 1: Details of 74 GIS Environmental layers Used in the Model Prediction (+ 3 Additional Internal Columns)


Name and abbreviation of GIS layer; source; comment

  • Monthly mean temperature: These are standard layers used for GIS modeling
  • Monthly minimum temperature: (see above)
  • Monthly maximum temperature (tmax 1–12): (see above)
  • Monthly precipitation: (see above)
  • [Four further rows; layer names missing]: (see above)
  • [Layer name missing]; Herrick et al. (2013): Several global landcover layers exist
  • Human infrastructure index; Herrick et al. (2013): Human footprint. Several human footprint layers
  • Distance to waterbody/lake; Mi unpublished: While essential for cranes, this layer is unlikely to be very accurate due to the huge and ephemeral wetlands worldwide
  • Distance to coastline; Mi unpublished: Relies on the coastline map resolution
  • x coordinate: Not often used in most GIS model work but important for geo-referencing
  • y coordinate: Not often used in most GIS model work but important for geo-referencing
  • Row index: Not often used in most GIS model work but important for row identification

Appendix 2

1.1 List of Top 20 Predictors, as identified by TreeNet ranking


Relative Importance

Distance to lake

Appendix 3

1.1 Prediction Model Details for the Best Performing Model (the ‘Kitchen sink model’ with 74 predictors)

Siberian crane with a battery run on TreeNet (SPM7), balanced

The ‘kitchen sink’ model, all 74 environmental predictors

figure a

Frequency of predicted Relative Index of Occurrence (RIO, 0–1) for known presence (1)

figure b
figure c

Appendix 4

(For Prediction Map 1 for the ‘Kitchen sink model’ see Fig. 8.4 in the text; for map legends please see this figure; same for all other appendix maps)

(For Prediction Map 2 for the ‘TMax12 model’ see Fig. 8.5 in the text)

1.1 Prediction Map 3 for the ‘BIO14 model’

figure d

1.2 Prediction Map 4 for the ‘TMax12BIO14 model’

figure e

1.3 Prediction Map 5 for the ‘Top5 model’

figure f

1.4 Prediction Map 6 for the ‘Top10 model’

figure g

1.5 Prediction Map 7 for the ‘Top29 model’

figure h

1.6 Prediction Map 8 for the ‘Top35 model’

figure i

1.7 Prediction Map 9 for the ‘Bottom 44 model’

figure j

1.8 Prediction Map 10 for the ‘Leaving out top 3 interacting predictors model’

figure k

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Huettmann, F., Mi, C., Guo, Y. (2018). ‘Batteries’ in Machine Learning: A First Experimental Assessment of Inference for Siberian Crane Breeding Grounds in the Russian High Arctic Based on ‘Shaving’ 74 Predictors. In: Humphries, G., Magness, D., Huettmann, F. (eds) Machine Learning for Ecology and Sustainable Natural Resource Management. Springer, Cham.
