Bird Species Classification Using Transfer Learning with Multistage Training

  • Akash Kumar
  • Sourya Dipta Das
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1019)


Bird species classification has received increasing attention in computer vision because of its promising applications in biology and environmental studies. Recognizing bird species is difficult because it requires both localizing discriminative regions and learning fine-grained features. In this paper, we introduce a transfer-learning-based method with multistage training. We use a pre-trained Mask R-CNN together with an ensemble of Inception networks (InceptionV3 and InceptionResNetV2) to obtain both the location and the species of the bird in each image. We tested our model on an Indian bird dataset consisting of variable-size, high-resolution images taken with a camera in various environments (such as day, noon, and evening) with different perspectives and occlusions. Our final model achieves an F1 score of 0.5567 (55.67%) on that dataset.

Code is available at: Implemented in Keras [20].
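As a rough illustration of the two quantitative ideas in the abstract, the sketch below combines the class probabilities of two classifiers and scores the result with an F1 measure. It is not the authors' implementation: the abstract does not say how the two Inception networks are combined or which F1 averaging is reported, so simple probability averaging and macro-averaged F1 are assumptions made here, and the function names and toy numbers are hypothetical. The Mask R-CNN localization stage is not modeled.

```python
from typing import List, Sequence


def average_ensemble(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Average per-class probabilities across models (assumed combination rule).

    prob_sets[m][i][c] is model m's probability for class c on image i.
    """
    n_models = len(prob_sets)
    n_images = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [
        [sum(model[i][c] for model in prob_sets) / n_models for c in range(n_classes)]
        for i in range(n_images)
    ]


def argmax_labels(probs: Sequence[Sequence[float]]) -> List[int]:
    """Pick the most probable class index for each image."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 by convention.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


if __name__ == "__main__":
    # Hypothetical softmax outputs of two classifiers on three images, three classes.
    model_a = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.5, 0.4, 0.1]]
    model_b = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5], [0.2, 0.7, 0.1]]
    y_true = [0, 1, 2]

    y_pred = argmax_labels(average_ensemble([model_a, model_b]))
    print(y_pred, round(macro_f1(y_true, y_pred, n_classes=3), 4))
```

In this toy example the two models disagree on the second and third images, and averaging settles each disagreement in favor of the class with the higher mean probability. Macro averaging weights every species equally, which matters when some species have few images; a micro or weighted average would generally give a different number.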


Bird species classification · Deep networks · Transfer learning · Multistage training · Object detection


  1. Fu, J., Zheng, H., Mei, T.: Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, pp. 4476–4484 (2017)
  2. Pang, C., Yao, H., Sun, X.: Discriminative features for bird species classification. In: Proceedings of International Conference on Internet Multimedia Computing and Service, ICIMCS 2014, p. 256, 5 p. ACM, New York (2014)
  3. Ge, Z., McCool, C., Sanderson, C., Bewley, A., Chen, Z., Corke, P.: Fine-grained bird species recognition via hierarchical subset learning. In: 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, pp. 561–565 (2015)
  4. Atanbori, J., Duan, W., Murray, J., Appiah, K., Dickinson, P.: Automatic classification of flying bird species using computer vision techniques. Pattern Recogn. Lett. 81, 53–62 (2016)
  5. Marini, A., Facon, J., Koerich, A.: Bird species classification based on color features. In: Proceedings - 2013 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2013, pp. 4336–4341 (2013)
  6. Branson, S., et al.: Bird species categorization using pose normalized deep convolutional nets. CoRR abs/1406.2952 (2014)
  7. Yang, S., et al.: Unsupervised template learning for fine-grained object recognition. In: NIPS (2012)
  8. He, K., et al.: Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988 (2017)
  9. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
  10. Szegedy, C., et al.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: AAAI (2017)
  11. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, United States, pp. 2818–2826 (2016)
  12. Ramachandran, P., Zoph, B., Le, Q.V.: Swish: a self-gated activation function (2017)
  13. Vinyals, O., Blundell, C., Lillicrap, T.P., Kavukcuoglu, K., Wierstra, D.: Matching networks for one shot learning. In: NIPS (2016)
  14. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014)
  15. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. CoRR abs/1512.03385 (2015)
  16. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014)
  17. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS 2015, Montreal, Canada (2015)
  18. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 640–651 (2017)
  19. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 dataset. Computation & Neural Systems. Technical Report, CNS-TR-2011-001 (2011)
  20. Chollet, F., et al.: Keras (2015)

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Delhi Technological University, Delhi, India
  2. Jadavpur University, Kolkata, India
