Unsupervised Bioacoustic Segmentation by Hierarchical Dirichlet Process Hidden Markov Model

  • Vincent Roger
  • Marius Bartcus
  • Faicel Chamroukhi
  • Hervé Glotin
Part of the Multimedia Systems and Applications book series (MMSA)


Bioacoustics is a powerful tool for monitoring biodiversity. In this paper we investigate an automatic segmentation model for real-world bioacoustic scenes in order to infer hidden states referred to as song units. However, the number of these acoustic units is often unknown, unlike in human speech recognition. We therefore propose a bioacoustic segmentation based on the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM), a Bayesian non-parametric (BNP) model, to tackle this challenging problem. We focus our approach on unsupervised learning from bioacoustic sequences: it simultaneously finds the structure of the hidden song units and automatically infers the unknown number of hidden states. We investigate two real bioacoustic scenes: whale song and multi-species bird songs. We learn the models using Markov chain Monte Carlo (MCMC) sampling techniques on Mel Frequency Cepstral Coefficients (MFCC). Our results, scored by a bioacoustic expert, show that the model generates correct song unit segmentation. This study provides new insights for the unsupervised analysis of complex soundscapes and illustrates the potential of such analysis for chunking non-human animal signals into structured units. This can lead to new representations of the calls of a target species, and also to the structuring of inter-species calls. It gives experts a tractable approach for efficient bioacoustic research, as requested in Kershenbaum et al. (Biol Rev 91(1):13–52, 2016).
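
To give an intuition for how a Dirichlet process prior leaves the number of hidden states open, the following is a minimal sketch of the stick-breaking construction (Sethuraman, 1994), truncated to a finite number of components. It is an illustration only, not the sampler used in this work: the function names, the concentration values, and the truncation level of 50 are all assumptions chosen for the example.

```python
import random


def stick_breaking(gamma, truncation, rng):
    """Draw truncated stick-breaking weights, approximately GEM(gamma).

    gamma (concentration) and truncation are illustrative parameters,
    not values taken from the paper.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, gamma)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass goes to the last component
    return weights


def effective_states(weights, mass=0.99):
    """Count how many of the largest weights are needed to cover `mass`."""
    total = 0.0
    for k, w in enumerate(sorted(weights, reverse=True), start=1):
        total += w
        if total >= mass:
            return k
    return len(weights)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for gamma in (0.5, 2.0, 8.0):
        beta = stick_breaking(gamma, truncation=50, rng=rng)
        print(gamma, effective_states(beta))
```

Each draw produces a different number of components carrying non-negligible weight, and larger concentration values tend to spread the mass over more components. In an HDP-HMM, weights of this kind act as a shared prior over the rows of the transition matrix, which is what lets the number of song units be inferred from the data rather than fixed in advance.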



We would like to thank the Provence-Alpes-Côte d’Azur region and NortekMed for their financial support of Vincent Roger. We also thank GDR CNRS MADICS for its support. We thank G. Pavan for his expertise, J. Sueur, F. Deroussen, and F. Jiguet for co-organising the challenges, and M. Roch for her collaboration.


  1. Bartcus, M., Chamroukhi, F., & Glotin, H. (2015, July). Hierarchical Dirichlet Process Hidden Markov Model for Unsupervised Bioacoustic Analysis. In 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–7. IEEE.
  2. Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statistica Sinica, pp. 639–650.
  3. Kershenbaum, A., Blumstein, D.T., Roch, M.A., Akçay, Ç., Backus, G., Bee, M.A., Bohn, K., Cao, Y., Carter, G., Cäsar, C. and Coen, M. (2016). Acoustic sequences in non-human animals: a tutorial review and prospectus. Biological Reviews, 91(1), pp. 13–52.
  4. Rabiner, L. and Juang, B. (1986). An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1), pp. 4–16.
  5. Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), pp. 461–464.
  6. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), pp. 716–723.
  7. Teh, Y.W., Jordan, M.I., Beal, M.J. and Blei, D.M. (2006). Hierarchical Dirichlet Processes. Journal of the American Statistical Association, 101(476), pp. 1566–1581.
  8. Beal, M.J., Ghahramani, Z., & Rasmussen, C.E. (2002). The infinite hidden Markov model. In Advances in Neural Information Processing Systems, pp. 577–584.
  9. Fox, E.B., Sudderth, E.B., Jordan, M.I., & Willsky, A.S. (2008, July). An HDP-HMM for systems with state persistence. In Proceedings of the 25th International Conference on Machine Learning, pp. 312–319. ACM.
  10. Helweg, D.A., Cato, D.H., Jenkins, P.F., Garrigue, C. and McCauley, R.D. (1998). Geographic Variation in South Pacific Humpback Whale Songs. Behaviour, 135(1), pp. 1–27.
  11. Medrano, L., Salinas, M., Salas, I., Guevara, P.L.D., Aguayo, A., Jacobsen, J. and Baker, C.S. (1994). Sex identification of humpback whales, Megaptera novaeangliae, on the wintering grounds of the Mexican Pacific Ocean. Canadian Journal of Zoology, 72(10), pp. 1771–1774.
  12. Frankel, A.S., Clark, C.W., Herman, L. and Gabriele, C.M. (1995). Spatial distribution, habitat utilization, and social interactions of humpback whales, Megaptera novaeangliae, off Hawai’i, determined using acoustic and visual techniques. Canadian Journal of Zoology, 73(6), pp. 1134–1146.
  13. Baker, C.S. and Herman, L.M. (1984). Aggressive behavior between humpback whales (Megaptera novaeangliae) wintering in Hawaiian waters. Canadian Journal of Zoology, 62(10), pp. 1922–1937.
  14. Garland, E.C., Goldizen, A.W., Rekdahl, M.L., Constantine, R., Garrigue, C., Hauser, N.D., Poole, M.M., Robbins, J. and Noad, M.J. (2011). Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Current Biology, 21(8), pp. 687–691.
  15. Catchpole, C.K. and Slater, P.J.B. (1995). Birdsong: Biological Themes and Variations. Cambridge University Press.
  16. Kroodsma, D.E., & Miller, E.H. (Eds.). (1996). Ecology and evolution of acoustic communication in birds, pp. 269–281. Comstock Pub.
  17. Pace, F., Benard, F., Glotin, H., Adam, O. and White, P. (2010). Subunit definition and analysis for humpback whale call classification. Applied Acoustics, 71(11), pp. 1107–1112.
  18. Picot, G., Adam, O., Bergounioux, M., Glotin, H. and Mayer, F.X. (2008, October). Automatic prosodic clustering of humpback whales song. In New Trends for Environmental Monitoring Using Passive Systems, 2008, pp. 1–6. IEEE.
  19. Glotin, H., LeCun, Y., Artieres, T., Mallat, S., Tchernichovski, O., & Halkias, X. (2013). Neural information processing scaled for bioacoustics, from neurons to big data. USA (2013).
  20. Deroussen, F. and Jiguet, F. (2006). La sonotheque du Museum: Oiseaux de France. Nashvert Production, Charenton, France.
  21. Baum, L.E., Petrie, T., Soules, G. and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41(1), pp. 164–171.
  22. Biernacki, C., Celeux, G. and Govaert, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(7), pp. 719–725.
  23. Ferguson, T.S. (1973). A Bayesian analysis of some nonparametric problems. The Annals of Statistics, pp. 209–230.
  24. Pitman, J. (1995). Exchangeable and partially exchangeable random partitions. Probability Theory and Related Fields, 102(2), pp. 145–158.
  25. Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (2003). Bayesian Data Analysis (Chapman & Hall/CRC Texts in Statistical Science).
  26. Strehl, A. and Ghosh, J. (2002). Cluster ensembles—a knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research, 3(Dec), pp. 583–617.
  27. Jordan, M.I., Ghahramani, Z., Jaakkola, T.S. and Saul, L.K. (1999). An introduction to variational methods for graphical models. Machine Learning, 37(2), pp. 183–233.
  28. Foti, N., Xu, J., Laird, D., & Fox, E. (2014). Stochastic variational inference for hidden Markov models. In Advances in Neural Information Processing Systems, pp. 3599–3607.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Vincent Roger (1)
  • Marius Bartcus (1)
  • Faicel Chamroukhi (2)
  • Hervé Glotin (1)
  1. DYNI Team, DYNI, Aix Marseille Univ, Université de Toulon, CNRS, LIS, Marseille, France
  2. LMNO UMR CNRS, Statistics and Data Science, University of Caen, Caen, France
