Species Distribution Modeling via Spatial Bagging of Multiple Conditional Random Fields

  • Danhuai Guo
  • Yuanchun Zhou (Email author)
  • Yingqiu Zhu
  • Jianhui Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9643)


Satellite tracking technologies enable scientists to collect data on animal migrations and species habitats on a large scale. Modeling the distributions of wild animals is of considerable use: it helps researchers understand important ecological phenomena such as the spread of bird flu and climate change. Species distribution modeling has been studied for a long time. However, most existing work provides solutions in a point-wise manner, ignoring the relevance between adjacent habitats, which may reflect an important dependency between nearby places. In this paper, we take this relevance into consideration and propose a novel method to model species habitats and predict the possible distribution of wild animals by applying Spatial Bagging of Multiple Conditional Random Fields (SBMCRFs) to remote-sensing data. To assess the usability of our method, several experiments are conducted on a real-world dataset of migratory birds from the Qinghai Lake Reserve. The experimental results show that SBMCRFs outperforms the baselines significantly, and that the relevance between nearby places is an important factor in species distribution modeling.
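The contrast between point-wise prediction and prediction that accounts for adjacent places can be illustrated with a toy sketch. The code below is not the SBMCRFs method of the paper: it assumes a small grid of habitat cells, averages per-cell scores from several hypothetical models (a bagging-style aggregation), and then applies iterated conditional modes, a standard approximate inference scheme for grid random fields swapped in here purely for illustration. All function names, the weight `beta`, and the score maps are assumptions.

```python
# Toy sketch only: NOT the SBMCRFs implementation from the paper.
# All names, weights, and data below are hypothetical.

def neighbors(r, c, rows, cols):
    """Yield the 4-connected neighbors of cell (r, c) inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc

def average_scores(score_maps):
    """Bagging-style aggregation: average per-cell scores over several models."""
    rows, cols = len(score_maps[0]), len(score_maps[0][0])
    n = len(score_maps)
    return [[sum(m[r][c] for m in score_maps) / n for c in range(cols)]
            for r in range(rows)]

def pointwise_labels(scores, threshold=0.5):
    """Label each cell independently of its neighbors."""
    return [[1 if s >= threshold else 0 for s in row] for row in scores]

def icm_labels(scores, beta=0.3, sweeps=5):
    """Iterated conditional modes with a pairwise agreement term.

    Each cell picks the label maximizing
        unary(label) + beta * (number of neighbors sharing that label),
    where unary(1) = score and unary(0) = 1 - score.
    """
    rows, cols = len(scores), len(scores[0])
    labels = pointwise_labels(scores)  # start from the point-wise solution
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                best_label, best_value = labels[r][c], float("-inf")
                for label in (0, 1):
                    unary = scores[r][c] if label == 1 else 1.0 - scores[r][c]
                    agree = sum(1 for nr, nc in neighbors(r, c, rows, cols)
                                if labels[nr][nc] == label)
                    value = unary + beta * agree
                    if value > best_value:
                        best_label, best_value = label, value
                if best_label != labels[r][c]:
                    labels[r][c] = best_label
                    changed = True
        if not changed:  # converged: no cell changed in this sweep
            break
    return labels

# Two hypothetical score maps (probability that a cell is habitat).
map_a = [[0.9, 0.8, 0.9], [0.8, 0.3, 0.8], [0.9, 0.8, 0.9]]
map_b = [[0.7, 0.8, 0.7], [0.8, 0.5, 0.8], [0.7, 0.8, 0.7]]

avg = average_scores([map_a, map_b])
pointwise = pointwise_labels(avg)
smoothed = icm_labels(avg)
```

In this example the center cell has a weak averaged score and is labeled non-habitat point-wise, but it is relabeled as habitat once agreement with its four surrounding habitat cells is taken into account.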


Species distribution modeling · Conditional random fields · Ensemble methods



This work is partly supported by the Natural Science Foundation of China (NSFC) under Grant Nos. 41371386 and 91224006.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Danhuai Guo (1)
  • Yuanchun Zhou (1, Email author)
  • Yingqiu Zhu (1)
  • Jianhui Li (1)
  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
