ICONIP 2015: Neural Information Processing, pp 348-355

A Cortically-Inspired Model for Bioacoustics Recognition

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9492)

Abstract

Wavelet transforms have shown superior performance in auditory recognition tasks compared to the more commonly used Mel-Frequency Cepstral Coefficients (MFCCs), and they can more closely model the frequency response behaviour of the cochlear basilar membrane. In this paper we evaluate a gammatone wavelet as a preprocessor for the Hierarchical Temporal Memory (HTM) model of the neocortex, as part of the broader development of a biologically motivated approach to sound recognition. Specifically, we apply, for the first time, a gammatone/equivalent rectangular bandwidth wavelet transform in conjunction with the HTM’s Spatial Pooler to recognise frog calls, bird songs and insect sounds. Our audio feature detection results show that wavelets perform considerably better than MFCCs on our selected datasets, but that combining wavelets with HTM does not produce further improvements. This outcome raises questions concerning the degree of match to the biology required for an effective HTM-based model of audition.
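The abstract does not give implementation details, so the following is only a minimal sketch of the gammatone/equivalent rectangular bandwidth idea, not the authors' method. It assumes the Glasberg and Moore (1990) approximation for the equivalent rectangular bandwidth, a fourth-order gammatone, and the bandwidth factor 1.019 that is commonly paired with that order. The function names (`erb`, `gammatone_kernel`, `band_energies`) are hypothetical, and the sketch is a plain filterbank that omits the wavelet scaling, the Spatial Pooler and the k-NN classifier.

```python
import math


def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz at centre frequency fc_hz
    (assumed form: Glasberg and Moore, 1990)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def gammatone_kernel(fc_hz, sample_rate, duration_s=0.02, order=4):
    """Sampled gammatone impulse response, normalised to unit energy.

    g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), with b = 1.019 * ERB(fc).
    The duration, order and 1.019 factor are illustrative assumptions.
    """
    b = 1.019 * erb(fc_hz)
    n_samples = int(duration_s * sample_rate)
    kernel = []
    for i in range(n_samples):
        t = i / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        kernel.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    norm = math.sqrt(sum(v * v for v in kernel))
    return [v / norm for v in kernel]


def band_energies(signal, sample_rate, centre_freqs):
    """Energy of the signal in each gammatone band.

    Uses direct 'valid' convolution, so only output samples where the
    kernel fully overlaps the signal contribute to the energy.
    """
    energies = []
    for fc in centre_freqs:
        kernel = gammatone_kernel(fc, sample_rate)
        k_len = len(kernel)
        total = 0.0
        for n in range(k_len - 1, len(signal)):
            # y[n] = sum_j kernel[j] * signal[n - j]
            y = sum(kernel[j] * signal[n - j] for j in range(k_len))
            total += y * y
        energies.append(total)
    return energies
```

Calling `band_energies(signal, sample_rate, [500.0, 1000.0, 2000.0])` on a short list of samples returns one energy value per band. A vector of this kind is the sort of feature that could be passed to a downstream classifier, although the paper's actual feature pipeline may differ.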

Keywords

Signal processing, Wavelet transforms, Bioacoustics, Machine learning, Spatial pooling, Hierarchical temporal memory, k-NN classifier

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Cognitive Computing Unit, Institute for Integrated and Intelligent Systems, Griffith University, Gold Coast, Australia
