A micro-control device of soundscape collection for mixed frog call recognition

  • Chih-Cheng Chiu
  • Tung-Kuan Liu
  • Wen-Ping Chen
  • Wen-Chih Lin
  • Jyh-Horng Chou
Technical Paper


An ecological forest park combines leisure and research, but the local ecological balance can be disturbed if the number of visitors exceeds the park's quota. Ecologists commonly use wild soundscapes in surveys of frog ecology. However, a field soundscape recorded into a single channel from multiple sources is highly complex, since it contains various kinds of background sound and an unknown number of mixed sources. As a result, blind source separation performs poorly in the subsequent voiceprint recognition stage. This paper uses a micro server to automatically steer a directional microphone toward the animal source. The device also uses an interference tube to suppress noise arriving from outside the microphone's pickup direction, and it predicts the number of mixed sources required by blind source separation from the clustering of the frogs and the differences in the gaps between croaks. Finally, the adaptive multi-stage average spectrum (AMSAS) method is used to separate the animal sources. The experiments use recordings from the Shan-Ping Forest Ecological Garden, comprising monosyllables of six frog species and mixtures of seven species. We compare the recognition rates obtained with dynamic time warping, multi-stage average spectrum, and AMSAS to verify the superiority of the proposed method.
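Dynamic time warping, one of the baselines named in the abstract, can be illustrated with a minimal sketch. This is the textbook algorithm applied to one-dimensional feature sequences, not the authors' implementation. The function names `dtw_distance` and `classify`, the species labels, and the toy sequences are all hypothetical, and the absolute difference used as the local cost is an assumption.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences.

    Assumption: local cost is the absolute difference of the two samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW distance to query."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical toy templates; real systems would use per-frame spectral features.
templates = {
    "species_a": [1, 2, 3, 2, 1],
    "species_b": [5, 5, 5, 5],
}
query = [1, 1, 2, 3, 3, 2, 1]  # a time-stretched copy of the species_a template
label = classify(query, templates)
```

Because the warping path may repeat samples, the stretched query aligns to the `species_a` template at zero cost, which is the property that makes this kind of template matching tolerant of differences in call duration.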



The authors would like to thank the Ministry of Science and Technology (MOST) of Taiwan for supporting this research under Project number MOST 106-2622-E-151-018-CC3.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Program in Engineering Science and Technology, College of Engineering, National Kaohsiung First University of Science and Technology, Kaohsiung, Taiwan, ROC
  2. Department of Mechanical and Automation Engineering, National Kaohsiung First University of Science and Technology, Kaohsiung, Taiwan, ROC
  3. Department of Electrical Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan, ROC
  4. Graduate Institute of Clinical Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan, ROC
  5. Liouguei Research Center, Taiwan Forestry Research Institute, Kaohsiung, Taiwan, ROC
