A micro-control device of soundscape collection for mixed frog call recognition
An ecological forest park combines leisure and research, but the local ecological balance can be disturbed when the number of visitors exceeds the park's quota. Wild soundscapes are among the most common tools ecologists use to survey frog ecology. However, a field soundscape recorded from multiple sources into a single channel is highly complex, since it contains various kinds of background sound and an unknown number of mixed sources. Under these conditions, blind source separation does not work well as a front end for the voiceprint recognition algorithms applied afterwards. This paper uses a micro server to automatically steer a directional microphone toward the animal source. The device also uses an interference tube to reject noise arriving from outside the microphone's pickup direction, and it predicts the number of mixed sources needed for blind source separation from the clustering of the frogs and the differences in the gaps between croaks. Finally, the adaptive multi-stage average spectrum (AMSAS) is used to separate the animal sources. The experiments use recordings from the Shan-Ping Forest Ecological Garden, comprising monosyllables of six frog species and mixed recordings of seven species. We compare the recognition rates obtained with dynamic time warping, multi-stage average spectrum, and AMSAS to verify the superiority of the proposed method.
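Of the three recognition methods compared, dynamic time warping (DTW) is the standard baseline, and a minimal sketch of it may help readers unfamiliar with the technique. The Python below is illustrative only and is not the implementation used in this paper: the function names `dtw_distance` and `classify`, the absolute-difference cost, and the use of one-dimensional feature sequences are all assumptions made for brevity. A real system would typically compare frame-wise spectral feature vectors.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences.

    Assumption: the local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template closest to the query under DTW.

    `templates` is a hypothetical dict mapping a species label to a
    reference feature sequence.
    """
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

Because DTW lets either sequence repeat samples during alignment, two calls with the same contour but different durations still score as close, which is why it is a common baseline for syllable-level matching.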
The authors would like to thank the Ministry of Science and Technology (MOST) of Taiwan for supporting this research under Project number MOST 106-2622-E-151-018-CC3.