Chinese Science Bulletin

Volume 59, Issue 32, pp 4323–4331

Obtaining the best possible predictions of habitat selection for wintering Great Bustards in Cangzhou, Hebei Province with rapid machine learning analysis

Article Ecology


Great Bustards (Otis tarda dybowskii) are among the world’s heaviest flying birds and occupy grassland habitats in Eastern Asia. Our study site is the easternmost Chinese wintering site, in Cangzhou, Hebei Province, where approximately 100 individuals are concentrated in a small area (17.53 km²). Solid information on the wintering areas of this subspecies is still lacking for its eastern range, and for China in particular. The study area consists of intensively used farmland close to human settlements and lacks conservation areas and wild, open fields. Here, we present results from two years of field data collection on habitat selection. We chose a machine learning modeling approach, based on a rapid assessment methodology, for the winter habitat of the Great Bustard. It relies on a spatial analysis of the best available environmental data, which were collected relatively quickly. These methods, relatively new in ecology, build on decision trees and their ensembles and include algorithms such as TreeNet, Random Forest and CART, used in parallel. In this study, we collected bustard droppings (presence only) at 48 locations between December 2011 and January 2012 and used these sites as training data. Droppings collected at 23 locations in November 2012 provided the test data. We used eight environmental variables as predictor layers for the response variable, bustard presence/availability. We employed a Geographic Information System (ArcGIS 10.1 and Geospatial Modelling Environment) and Google Earth. Compared with the other three models, Random Forest produced predictions that differed significantly between presence and absence. According to this model, the three most important factors for wintering Great Bustards are distance to residential areas, distance to water pools, and farmland area. Our model shows that wintering Great Bustards prefer locations more than 400 m from residential areas, within 900 m of water pools, and on farmland patches smaller than 0.5 km². We believe this analysis can be applied to Great Bustard management in our study area and the adjacent region, and that this work sets a baseline for future research.
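As an illustration only, the three habitat preferences reported above can be written as a simple screening rule. The sketch below is not the Random Forest model itself: treating the reported values as hard cut-offs is a simplifying assumption, and the `Site` fields, the function names and the example sites are all hypothetical.

```python
from dataclasses import dataclass

# Thresholds as reported in the abstract. Treating them as hard cut-offs
# is an assumption made for this sketch, not the fitted model.
MIN_DIST_RESIDENTIAL_M = 400.0   # "over 400 m away from residential areas"
MAX_DIST_WATER_M = 900.0         # "within 900 m of water pools"
MAX_FARMLAND_AREA_KM2 = 0.5      # "farmland smaller than 0.5 km2"


@dataclass(frozen=True)
class Site:
    """Hypothetical candidate location with three of the predictors."""
    name: str
    dist_residential_m: float
    dist_water_m: float
    farmland_area_km2: float


def matches_reported_preferences(site: Site) -> bool:
    """Return True if the site satisfies all three reported preferences."""
    return (
        site.dist_residential_m > MIN_DIST_RESIDENTIAL_M
        and site.dist_water_m <= MAX_DIST_WATER_M
        and site.farmland_area_km2 < MAX_FARMLAND_AREA_KM2
    )


def screen_sites(sites):
    """Return the names of sites that pass the screening rule."""
    return [s.name for s in sites if matches_reported_preferences(s)]


# Made-up example sites, not field data.
candidates = [
    Site("A", 650.0, 300.0, 0.3),    # passes all three conditions
    Site("B", 250.0, 300.0, 0.3),    # too close to a residential area
    Site("C", 650.0, 1200.0, 0.3),   # too far from a water pool
    Site("D", 650.0, 300.0, 0.8),    # farmland patch too large
]

print(screen_sites(candidates))
```

A rule of this kind is useful only as a first-pass filter; the fitted model weighs all eight predictors jointly rather than applying independent thresholds.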


Hebei Province (China), Wintering habitat, Great Bustards, Predictive modeling, Random Forest



We heartily thank Gao Yun and Liu Min for their help with data collection, the EWHALE lab, Salford Systems Ltd, Monitoring Network, and all those who have contributed to Great Bustard censuses and their conservation. This work was supported by the National Forestry Bureau of China (1105-LYSJWT-113).



Copyright information

© Science China Press and Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. College of Nature Conservation, Beijing Forestry University, Beijing, China
  2. EWHALE Laboratory, Department of Biology and Wildlife, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, USA
