Bird Sound Detection Based on Binarized Convolutional Neural Networks

  • Jianan Song
  • Shengchen Li
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 568)


Bird Sound Detection (BSD) is helpful for monitoring biodiversity, and deep learning networks have shown good performance in BSD in recent years. However, such complex network structures require large amounts of memory and computing power, at great cost, to perform the extensive calculations involved, which makes it difficult to implement BSD in hardware. Therefore, we designed an audio classification method for BSD using a Binarized Convolutional Neural Network (BCNN), in which the convolutional layers and fully connected layers of the original Convolutional Neural Network (CNN) are binarized to two values. This paper proposes two networks (a CNN and a BCNN) for the BSD task of the IEEE AASP Challenge on the Detection and Classification of Acoustic Scenes and Events (DCASE2018). The Area Under the ROC Curve (AUC) score of the BCNN is comparable with that of the CNN on the unseen evaluation data. More importantly, the use of the BCNN could reduce the memory requirement and the hardware loss unit, which is of great significance for the hardware implementation of a bird sound detection system.
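As a rough illustration of why binarizing a layer to two values reduces memory, the sketch below maps real-valued weights to {-1, +1} with a sign function and stores each value as one bit. It is a minimal sketch, not the authors' implementation: the use of the sign function on weights, the helper names `binarize` and `pack_bits`, and the layer shape are all assumptions made for this example.

```python
import numpy as np


def binarize(weights: np.ndarray) -> np.ndarray:
    """Map real-valued weights to {-1, +1} (assumed rule: sign, with 0 -> +1)."""
    return np.where(weights >= 0, 1, -1).astype(np.int8)


def pack_bits(binary_weights: np.ndarray) -> np.ndarray:
    """Store each {-1, +1} weight as a single bit (1 for +1, 0 for -1)."""
    return np.packbits(binary_weights.ravel() > 0)


rng = np.random.default_rng(0)

# Hypothetical convolutional layer: 64 filters, 32 input channels, 3x3 kernels.
full_precision = rng.standard_normal((64, 32, 3, 3)).astype(np.float32)

binary = binarize(full_precision)
packed = pack_bits(binary)

# 32-bit floats versus 1 bit per weight.
compression_ratio = full_precision.nbytes / packed.nbytes
print(f"float32: {full_precision.nbytes} bytes, packed: {packed.nbytes} bytes")
print(f"compression ratio: {compression_ratio:.1f}x")
```

Under these assumptions, the storage for the layer's weights shrinks by the ratio of the original word size to one bit; any additional per-layer scaling factors or full-precision layers that a real BCNN keeps would reduce that gain.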


Bird sound detection, Convolutional neural networks, Binarized neural network



This work is partially supported by Youth Innovation Projects of Beijing University of Posts and Telecommunications (2017RC16): The method of evaluating the performance of FPGA computation platform for deep learning systems.



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Beijing University of Posts and Telecommunications, Beijing, China
