Multispecies bird sound recognition using a fully convolutional neural network

Applied Intelligence

Abstract

This study proposes a method based on fully convolutional neural networks (FCNs) to identify migratory birds from their songs, with the aim of determining which species pass through a given area and when. To find the best FCN architecture, a grid search was run over the depth, width, and activation function of the network. The best configuration uses four convolutional blocks with max pooling, 400 filters in the widest layer, and an adaptive activation function. A key advantage of the proposed FCN over other techniques is that it can recognize bird sounds in audio of any length, with an accuracy greater than 85%. Its architecture also allows it to detect more than one species in a single recording and to perform near-real-time recognition. In addition, the model is lightweight, which makes it well suited to deployment on IoT devices. A comparative analysis against other techniques shows an improvement of over 67% in the best case. These findings advance the field of bird sound recognition and offer practical insight into applying FCNs in real-world scenarios.
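To make the "audio of any length" and "more than one species" properties concrete, the following is a minimal pure-Python sketch of the general fully convolutional idea, not the authors' network. Everything in it is an assumption made for illustration: the helper names (`conv1d`, `max_pool`, `build_toy_fcn`, `predict`, `fake_clip`), the toy widths (4 and 6 filters in two blocks, rather than the four blocks and 400 filters reported above), plain ReLU instead of an adaptive activation, random untrained weights, global max pooling over time, and a per-species sigmoid output.

```python
import math
import random


def conv1d(x, weights, relu=True):
    """'Valid' 1-D convolution along time.

    x is a list of T frames, each a list of C_in features.
    weights has shape [C_out][K][C_in]. Returns T - K + 1 frames.
    """
    k = len(weights[0])
    n_in = len(x[0])
    out = []
    for t in range(len(x) - k + 1):
        frame = []
        for w in weights:
            s = sum(w[dt][c] * x[t + dt][c] for dt in range(k) for c in range(n_in))
            frame.append(max(0.0, s) if relu else s)
        out.append(frame)
    return out


def max_pool(x, size=2):
    """Non-overlapping max pooling along the time axis."""
    return [[max(col) for col in zip(*x[t:t + size])]
            for t in range(0, len(x) - size + 1, size)]


def init_weights(rng, c_out, k, c_in):
    """Small random weights; a stand-in for trained parameters."""
    return [[[rng.uniform(-0.1, 0.1) for _ in range(c_in)]
             for _ in range(k)] for _ in range(c_out)]


def build_toy_fcn(n_features=8, n_species=3, widths=(4, 6), seed=0):
    """Toy FCN: one conv + max-pool block per entry in `widths`, then a 1x1 conv head."""
    rng = random.Random(seed)
    blocks, c_in = [], n_features
    for c_out in widths:
        blocks.append(init_weights(rng, c_out, 3, c_in))
        c_in = c_out
    head = init_weights(rng, n_species, 1, c_in)
    return blocks, head


def predict(model, x):
    """Return one independent probability per species for a clip of any length."""
    blocks, head = model
    for w in blocks:
        x = max_pool(conv1d(x, w))
    scores = conv1d(x, head, relu=False)         # shape [T'][n_species]
    pooled = [max(col) for col in zip(*scores)]  # global max over time
    return [1.0 / (1.0 + math.exp(-z)) for z in pooled]


def fake_clip(n_frames, n_features=8, seed=1):
    """Random stand-in for a sequence of audio feature frames."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_features)] for _ in range(n_frames)]


model = build_toy_fcn()
print(len(predict(model, fake_clip(40))), len(predict(model, fake_clip(100))))
```

Because no layer depends on the number of input frames, both clips yield one score per species. Using independent sigmoids rather than a softmax is what lets several species be flagged in the same clip, for example by thresholding each score separately.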

Data Availability

The code and data are available from the corresponding author on reasonable request.

Notes

  1. https://www.xeno-canto.org/


Author information

Contributions

María Teresa García-Ordás: Conceptualization, Data curation, Methodology, Software, Visualization, Validation, Writing - Original draft preparation. Sergio Rubio-Martín: Data curation, Writing - Original draft preparation. Héctor Alaiz-Moretón: Conceptualization, Supervision, Writing - Reviewing and Editing. Isaías García-Rodríguez: Conceptualization, Supervision, Writing - Reviewing and Editing. José Alberto Benítez-Andrades: Data curation, Methodology, Software, Visualization, Validation, Writing - Reviewing and Editing.

Corresponding author

Correspondence to José Alberto Benítez-Andrades.

Ethics declarations

Competing interests

The authors declare that they have no conflicts of interest regarding this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

García-Ordás, M.T., Rubio-Martín, S., Benítez-Andrades, J.A. et al. Multispecies bird sound recognition using a fully convolutional neural network. Appl Intell 53, 23287–23300 (2023). https://doi.org/10.1007/s10489-023-04704-3

  • DOI: https://doi.org/10.1007/s10489-023-04704-3
