Abstract
Camera trap image analysis, although critical for habitat and species conservation, is often a manual, time-consuming, and expensive task. Automating this process would enable large-scale research on biodiversity hotspots of large, conspicuous mammal and bird species. This paper explores deep learning object detection and species-level classification models for this task, using two state-of-the-art architectures, YOLOv5 and Faster R-CNN, for two species: the white-lipped peccary and the collared peccary. The dataset contains 7,733 images, obtained after data augmentation, from the Tiputini Biodiversity Station. The models were trained on 70% of the dataset, validated on 20%, and tested on the remaining 10%. The Faster R-CNN model achieved an average mean Average Precision (mAP) of 0.26 at a 0.5 Intersection over Union (IoU) threshold and 0.114 over the 0.5 to 0.95 IoU range, which is comparable with the original results of Faster R-CNN on the MS COCO dataset. YOLOv5 achieved an average mAP of 0.5525 at a 0.5 IoU threshold and 0.37997 over the 0.5 to 0.95 IoU range. The YOLOv5 model was therefore shown to be more robust, with lower losses and a higher overall mAP than both Faster R-CNN and YOLOv5 trained on the MS COCO dataset. This work is one of the first steps towards an automated camera trap analysis tool, which would allow large-scale analysis of population and habitat trends to benefit the conservation of these species. The results suggest that hyperparameter fine-tuning would further improve the models and allow this tool to be extended to other native species.
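The mAP figures above are computed against IoU thresholds, which measure how well a predicted bounding box overlaps a ground-truth box. As a minimal sketch (not the evaluation code used in the paper), the IoU criterion for boxes in `(x1, y1, x2, y2)` format can be written as:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# At the mAP@0.5 threshold, a detection counts as a true positive
# only if its IoU with a matched ground-truth box is at least 0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 1/3: each box is half overlapped
```

The stricter 0.5 to 0.95 metric averages mAP over ten IoU thresholds in steps of 0.05, which is why those values are consistently lower than mAP@0.5.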
Acknowledgment
The authors express their gratitude to the Tiputini Biodiversity Station for providing the data used in this study, which were collected by all researchers and staff working on the TBS camera trap project. The station is affiliated with USFQ. The authors also thank the Applied Signal Processing and Machine Learning Research Group at USFQ for supplying the computing infrastructure (an NVIDIA DGX workstation) used to implement and run the developed source code.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zurita, MJ. et al. (2024). On the Use of Deep Learning Models for Automatic Animal Classification of Native Species in the Amazon. In: Orjuela-Cañón, A.D., Lopez, J.A., Arias-Londoño, J.D. (eds) Applications of Computational Intelligence. ColCACI 2023. Communications in Computer and Information Science, vol 1865. Springer, Cham. https://doi.org/10.1007/978-3-031-48415-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48414-8
Online ISBN: 978-3-031-48415-5