Unsupervised Bioacoustic Segmentation by Hierarchical Dirichlet Process Hidden Markov Model

Part of the book series: Multimedia Systems and Applications (MMSA)

Abstract

Bioacoustics is a powerful tool for monitoring biodiversity. In this paper, we investigate an automatic segmentation model for real-world bioacoustic scenes, with the aim of inferring hidden states referred to as song units. Unlike in human speech recognition, the number of these acoustic units is often unknown. To tackle this problem, we propose a bioacoustic segmentation based on the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM), a Bayesian non-parametric (BNP) model. Our approach is unsupervised: it learns from bioacoustic sequences by simultaneously finding the structure of the hidden song units and inferring the unknown number of hidden states. We investigate two real bioacoustic scenes: whale songs and multi-species bird songs. We learn the models using Markov chain Monte Carlo (MCMC) sampling techniques on Mel Frequency Cepstral Coefficients (MFCC). Our results, scored by a bioacoustic expert, show that the model generates correct song unit segmentation. This study offers new insights for the unsupervised analysis of complex soundscapes and illustrates the potential of chunking non-human animal signals into structured units. This can lead to new representations of the calls of a target species, and also to the structuring of inter-species calls. It gives experts a tractable approach for efficient bioacoustic research, as called for in Kershenbaum et al. (Biol Rev 91(1):13–52, 2016).
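To make the idea of an unknown number of hidden states more concrete, the following is a minimal sketch of drawing a state sequence from a truncated HDP-HMM prior, using the stick-breaking construction of Sethuraman (1994). It is not the inference procedure of the chapter (no MCMC and no observations are involved), and the function names, the hyperparameters `gamma` and `alpha`, and the truncation level are all assumptions chosen for illustration.

```python
import numpy as np


def stick_breaking(gamma, truncation, rng):
    """Truncated stick-breaking weights: beta_k = v_k * prod_{l<k} (1 - v_l)."""
    v = rng.beta(1.0, gamma, size=truncation)
    v[-1] = 1.0  # close the stick so the truncated weights sum to one
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def sample_hdp_hmm_prior(n_steps, gamma=2.0, alpha=5.0, truncation=20, seed=0):
    """Draw global weights, transition rows and a state path from the prior.

    All default values are illustrative assumptions, not the authors' settings.
    """
    rng = np.random.default_rng(seed)
    # Global state weights shared by every transition row.
    beta = stick_breaking(gamma, truncation, rng)
    # Each row is a Dirichlet draw centred on beta; the small constant
    # keeps every Dirichlet parameter strictly positive.
    pi = rng.dirichlet(alpha * beta + 1e-6, size=truncation)
    # Simulate the hidden state sequence.
    states = np.empty(n_steps, dtype=int)
    states[0] = rng.choice(truncation, p=beta)
    for t in range(1, n_steps):
        states[t] = rng.choice(truncation, p=pi[states[t - 1]])
    return beta, pi, states


if __name__ == "__main__":
    beta, pi, states = sample_hdp_hmm_prior(n_steps=200)
    print("states actually visited:", len(np.unique(states)), "of", len(beta))
```

The truncation level only caps the number of states; the number of states actually visited is determined by the sampled weights, which is the property that lets this family of models adapt the number of song units to the data.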

Notes

  1. The MFCCs are features that represent and compress the short-term power spectrum of a sound, following the Mel scale.

  2. http://sabiod.univ-tln.fr/nips4b/challenge2.html

  3. http://sabiod.univ-tln.fr/icml2013/BIRD_SAMPLES/
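The MFCC features described in note 1 can be sketched with NumPy alone. This is a simplified illustration, not the authors' feature extraction pipeline: the frame length, hop size, number of filters and number of coefficients are common speech-processing defaults assumed here, and the Hz-to-Mel formula is one widely used variant among several.

```python
import numpy as np


def hz_to_mel(f):
    # One common Mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):    # rising slope
            fbank[i - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):   # falling slope
            fbank[i - 1, k] = (right - k) / (right - centre)
    return fbank


def mfcc(signal, sample_rate, frame_len=400, hop=160, n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) array; signal must be a 1-D array of at least frame_len samples."""
    # Slice the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Short-term power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    # Mel filterbank energies, log-compressed (small offset avoids log(0)).
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energies = np.log(power @ fbank.T + 1e-10)
    # Unnormalised DCT-II of the log energies; keep the first coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 440.0 * t)  # one second of a synthetic 440 Hz tone
    print(mfcc(tone, sr).shape)
```

Each row of the output is one feature vector per frame, which is the kind of observation sequence a hidden Markov model would be fitted to.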

References

  1. Bartcus, M., Chamroukhi, F., & Glotin, H. (2015, July). Hierarchical Dirichlet Process Hidden Markov Model for Unsupervised Bioacoustic Analysis. In Neural Networks (IJCNN), 2015 International Joint Conference on pp. 1–7. IEEE.

  2. Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statistica sinica, 639–650.

  3. Kershenbaum, A., Blumstein, D.T., Roch, M.A., Akçay, Ç., Backus, G., Bee, M.A., Bohn, K., Cao, Y., Carter, G., Cäsar, C. and Coen, M. (2016). Acoustic sequences in non-human animals: a tutorial review and prospectus. Biological Reviews, 91(1), pp.13–52.

  4. Rabiner, L. and Juang, B. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1), pp.4–16.

  5. Schwarz, G. (1978). Estimating the dimension of a model. The annals of statistics, 6(2), pp.461–464.

  6. Akaike, H. (1974). A new look at the statistical model identification. IEEE transactions on automatic control, 19(6), 716–723.

  7. Teh, Y.W., Jordan, M.I., Beal, M.J. and Blei, D.M. (2006). Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476), pp.1566–1581.

  8. Beal, M. J., Ghahramani, Z., & Rasmussen, C. E. (2002). The infinite hidden Markov model. In Advances in neural information processing systems pp. 577–584.

  9. Fox, E. B., Sudderth, E. B., Jordan, M. I., & Willsky, A. S. (2008, July). An HDP-HMM for systems with state persistence. In Proceedings of the 25th international conference on Machine learning pp. 312–319. ACM.

  10. Helweg, D.A., Cato, D.H., Jenkins, P.F., Garrigue, C. and McCauley, R.D. (1998). Geographic Variation in South Pacific Humpback Whale Songs. Behaviour, 135(1), pp.1–27.

  11. Medrano, L., Salinas, M., Salas, I., Guevara, P.L.D., Aguayo, A., Jacobsen, J. and Baker, C.S. (1994). Sex identification of humpback whales, Megaptera novaeangliae, on the wintering grounds of the Mexican Pacific Ocean. Canadian journal of zoology, 72(10), pp.1771–1774.

  12. Frankel, A.S., Clark, C.W., Herman, L. and Gabriele, C.M. (1995). Spatial distribution, habitat utilization, and social interactions of humpback whales, Megaptera novaeangliae, off Hawai’i, determined using acoustic and visual techniques. Canadian Journal of Zoology, 73(6), pp.1134–1146.

  13. Baker, C.S. and Herman, L.M. (1984). Aggressive behavior between humpback whales (Megaptera novaeangliae) wintering in Hawaiian waters. Canadian journal of zoology, 62(10), pp.1922–1937.

  14. Garland, E.C., Goldizen, A.W., Rekdahl, M.L., Constantine, R., Garrigue, C., Hauser, N.D., Poole, M.M., Robbins, J. and Noad, M.J. (2011). Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Current Biology, 21(8), pp.687–691.

  15. Catchpole, C.K. and Slater, P.J.B. (1995). Birdsong: Biological Themes and Variations. Cambridge University Press.

  16. Kroodsma, D. E., & Miller, E. H. (Eds.). (1996). Ecology and evolution of acoustic communication in birds pp. 269–281. Comstock Pub.

  17. Pace, F., Benard, F., Glotin, H., Adam, O. and White, P. (2010). Subunit definition and analysis for humpback whale call classification. Applied Acoustics, 71(11), pp.1107–1112.

  18. Picot, G., Adam, O., Bergounioux, M., Glotin, H. and Mayer, F.X. (2008, October). Automatic prosodic clustering of humpback whales song. In New Trends for Environmental Monitoring Using Passive Systems, 2008 pp. 1–6. IEEE.

  19. Glotin, H., LeCun, Y., Artieres, T., Mallat, S., Tchernichovski, O., & Halkias, X. (2013). Neural information processing scaled for bioacoustics, from neurons to big data. USA (2013). http://sabiod.org/NIPS4B2013_book.pdf.

  20. Deroussen F., Jiguet F. (2006). La sonotheque du Museum: Oiseaux de France. Nashvert Production, Charenton, France.

  21. Baum, L.E., Petrie, T., Soules, G. and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The annals of mathematical statistics, 41(1), pp.164–171.

  22. Biernacki, C., Celeux, G. and Govaert, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE transactions on pattern analysis and machine intelligence, 22(7), pp.719–725.

  23. Ferguson, T.S. (1973). A Bayesian analysis of some nonparametric problems. The annals of statistics, pp.209–230.

  24. Pitman, J. (1995). Exchangeable and partially exchangeable random partitions. Probability theory and related fields, 102(2), pp.145–158.

  25. Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2003). Bayesian Data Analysis, (Chapman & Hall/CRC Texts in Statistical Science).

  26. Strehl, A. and Ghosh, J. (2002). Cluster ensembles—a knowledge reuse framework for combining multiple partitions. Journal of machine learning research, 3(Dec), pp.583–617.

  27. Jordan, M.I., Ghahramani, Z., Jaakkola, T.S. and Saul, L.K. (1999). An introduction to variational methods for graphical models. Machine learning, 37(2), pp.183–233.

  28. Foti, N., Xu, J., Laird, D., & Fox, E. (2014). Stochastic variational inference for hidden Markov models. In Advances in neural information processing systems, pp.3599–3607.

Acknowledgements

We would like to thank the Provence-Alpes-Côte d’Azur region and NortekMed for their financial support of Vincent ROGER. We also thank GDR CNRS MADICS (http://sabiod.org/EADM) for its support. We thank G. Pavan for his expertise; J. Sueur, F. Deroussen and F. Jiguet for co-organising the challenges; and M. Roch for her collaboration.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Roger, V., Bartcus, M., Chamroukhi, F., Glotin, H. (2018). Unsupervised Bioacoustic Segmentation by Hierarchical Dirichlet Process Hidden Markov Model. In: Joly, A., Vrochidis, S., Karatzas, K., Karppinen, A., Bonnet, P. (eds) Multimedia Tools and Applications for Environmental & Biodiversity Informatics. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-76445-0_7

  • DOI: https://doi.org/10.1007/978-3-319-76445-0_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-76444-3

  • Online ISBN: 978-3-319-76445-0

  • eBook Packages: Computer Science, Computer Science (R0)
