Multiple-Instance Multiple-Label Learning for the Classification of Frog Calls with Acoustic Event Detection

  • Jie Xie
  • Michael Towsey
  • Liang Zhang
  • Kiyomi Yasumiba
  • Lin Schwarzkopf
  • Jinglan Zhang
  • Paul Roe
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9680)

Abstract

Frog call classification has received increasing attention because of its importance for ecosystem monitoring. Traditionally, frog call classification has been treated as a single-instance single-label classification problem. However, because different frog species tend to call simultaneously, classifying frog calls is more naturally framed as a multiple-instance multiple-label (MIML) learning problem. In this paper, we propose a novel method for classifying frog species using MIML classifiers. Specifically, continuous recordings are first segmented into audio clips (10 s). For each audio clip, acoustic event detection is used to segment frog syllables. Three feature sets are then extracted from each syllable: a mask descriptor, profile statistics, and the combination of the two. Next, a bag generator is applied to the extracted features. Finally, three MIML classifiers, MIML-SVM, MIML-RBF, and MIML-kNN, are employed to tag each audio clip with the frog species present. Experimental results show that the proposed method achieves high accuracy (81.8 % true positives/negatives) for frog call classification.
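The bag-of-syllables representation behind this pipeline can be illustrated with a small sketch: each 10 s clip becomes a bag of per-syllable feature vectors, bags are compared with the maximal Hausdorff distance, and a clip is tagged with the label set of its nearest training bag (a 1-NN simplification in the spirit of MIML-kNN). All feature values, species names, and helper functions below are illustrative assumptions, not taken from the paper.

```python
import math

def euclid(a, b):
    # Euclidean distance between two syllable feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hausdorff(bag_a, bag_b):
    """Maximal Hausdorff distance between two bags of feature vectors."""
    d_ab = max(min(euclid(a, b) for b in bag_b) for a in bag_a)
    d_ba = max(min(euclid(b, a) for a in bag_a) for b in bag_b)
    return max(d_ab, d_ba)

def tag_clip(query_bag, train_bags, train_labels):
    """Tag a clip (bag of syllable features) with the label set of its
    nearest training bag -- a 1-NN simplification of MIML-kNN."""
    dists = [hausdorff(query_bag, bag) for bag in train_bags]
    return train_labels[dists.index(min(dists))]

# Toy example: 2-D syllable features (e.g. dominant frequency in kHz,
# syllable duration in s); labels are sets of calling species (hypothetical).
train_bags = [
    [(2.1, 0.30), (2.0, 0.28)],               # one species calling
    [(2.1, 0.30), (4.5, 0.10), (4.4, 0.12)],  # two species overlapping
]
train_labels = [{"L. nasuta"}, {"L. nasuta", "L. rothii"}]

query = [(4.6, 0.11), (2.2, 0.29)]
print(tag_clip(query, train_bags, train_labels))
# prints the label set of the nearest bag: {'L. nasuta', 'L. rothii'}
```

The full method differs from this sketch in that the real feature sets (mask descriptor, profile statistics) are higher-dimensional, and MIML-kNN aggregates evidence from k neighbors and citers rather than copying a single neighbor's labels.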

Keywords

Frog call classification · Acoustic event detection · Multiple-instance multiple-label learning

References

  1. Wimmer, J., Towsey, M., Planitz, B., Williamson, I., Roe, P.: Analysing environmental acoustic data through collaboration and automation. Future Gener. Comput. Syst. 29(2), 560–568 (2013)
  2. Han, N.C., Muniandy, S.V., Dayou, J.: Acoustic classification of Australian Anurans based on hybrid spectral-entropy approach. Appl. Acoust. 72(9), 639–645 (2011)
  3. Gingras, B., Fitch, W.T.: A three-parameter model for classifying Anurans into four genera based on advertisement calls. J. Acoust. Soc. Am. 133(1), 547–559 (2013)
  4. Bedoya, C., Isaza, C., Daza, J.M., López, J.D.: Automatic recognition of Anuran species based on syllable identification. Ecol. Inf. 24, 200–209 (2014)
  5. Xie, J., Towsey, M., Truskinger, A., Eichinski, P., Zhang, J., Roe, P.: Acoustic classification of Australian Anurans using syllable features. In: 2015 IEEE Tenth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (IEEE ISSNIP 2015), Singapore, April 2015
  6. Zhou, Z.-H., Zhang, M.-L.: Multi-instance multi-label learning with application to scene classification. In: Advances in Neural Information Processing Systems, pp. 1609–1616 (2007)
  7. Briggs, F., Lakshminarayanan, B., Neal, L., Fern, X.Z., Raich, R., Hadley, S.J., Hadley, A.S., Betts, M.G.: Acoustic classification of multiple simultaneous bird species: a multi-instance multi-label approach. J. Acoust. Soc. Am. 131(6), 4640–4650 (2012)
  8. Zhang, M.-L., Wang, Z.-J.: MIMLRBF: RBF neural networks for multi-instance multi-label learning. Neurocomputing 72(16), 3951–3956 (2009)
  9. Zhang, M.-L.: A k-nearest neighbor based multi-instance multi-label learning algorithm. In: 22nd IEEE International Conference on Tools with Artificial Intelligence (ICTAI), vol. 2, pp. 207–212. IEEE (2010)
  10. Somervuo, P., et al.: Classification of the harmonic structure in bird vocalization. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), vol. 5, pp. V–701. IEEE (2004)
  11. Huang, C.-J., Yang, Y.-J., Yang, D.-X., Chen, Y.-J.: Frog classification using machine learning techniques. Expert Syst. Appl. 36(2), 3737–3743 (2009)
  12. Towsey, M., Planitz, B., Nantes, A., Wimmer, J., Roe, P.: A toolbox for animal call recognition. Bioacoustics 21(2), 107–125 (2012)
  13. Xie, J., Towsey, M., Zhang, J., Roe, P.: Image processing and classification procedure for the analysis of Australian frog vocalisations. In: Proceedings of the 2nd International Workshop on Environmental Multimedia Retrieval, ser. EMR 2015, New York, NY, USA, pp. 15–20. ACM (2015)
  14. Mallawaarachchi, A., Ong, S., Chitre, M., Taylor, E.: Spectrogram denoising and automated extraction of the fundamental frequency variation of dolphin whistles. J. Acoust. Soc. Am. 124(2), 1159–1170 (2008)
  15. Zhou, Z.-H., Zhang, M.-L., Huang, S.-J., Li, Y.-F.: MIML: a framework for learning with ambiguous objects. CoRR abs/0808.3231 (2008)
  16. Dimou, A., Tsoumakas, G., Mezaris, V., Kompatsiaris, I., Vlahavas, I.: An empirical study of multi-label learning methods for video annotation. In: Seventh International Workshop on Content-Based Multimedia Indexing, CBMI 2009, pp. 19–24. IEEE (2009)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Jie Xie¹
  • Michael Towsey¹
  • Liang Zhang¹
  • Kiyomi Yasumiba¹
  • Lin Schwarzkopf¹
  • Jinglan Zhang¹
  • Paul Roe¹
  1. Electrical Engineering and Computer Science School, Queensland University of Technology, Brisbane, Australia
