ICONIP 2015: Neural Information Processing pp 348-355
A Cortically-Inspired Model for Bioacoustics Recognition
Abstract
Wavelet transforms have shown superior performance in auditory recognition tasks compared to the more commonly used Mel-Frequency Cepstral Coefficients (MFCCs), and offer the ability to more closely model the frequency response behaviour of the cochlear basilar membrane. In this paper we evaluate a gammatone wavelet as a preprocessor for the Hierarchical Temporal Memory (HTM) model of the neocortex, as part of the broader development of a biologically motivated approach to sound recognition. Specifically, we apply, for the first time, a gammatone/equivalent rectangular bandwidth (ERB) wavelet transform in conjunction with the HTM’s Spatial Pooler to recognise frog calls, bird songs and insect sounds. Our audio feature detection results show that wavelets perform considerably better than MFCCs on our selected datasets, but that combining wavelets with HTM does not produce further improvements. This outcome raises questions concerning the degree of match to the biology required for an effective HTM-based model of audition.
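As a rough illustration of what a gammatone/ERB front end computes, the sketch below builds a sampled gammatone kernel whose bandwidth follows the equivalent rectangular bandwidth formula of Glasberg and Moore (1990) and measures how strongly a signal excites it. It is not the authors' implementation: the function names, the filter order of 4, the 1.019 bandwidth factor (a common convention) and the peak normalisation are all assumptions made for this example.

```python
import math

def erb_hz(centre_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * centre_hz / 1000.0 + 1.0)

def gammatone_kernel(centre_hz, sample_rate, n_samples=400, order=4):
    """Sampled gammatone impulse response, scaled to unit peak amplitude.

    order=4 and the 1.019 factor are conventional choices assumed here,
    not values taken from the paper.
    """
    b = 1.019 * erb_hz(centre_hz)
    kernel = []
    for i in range(n_samples):
        t = i / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        kernel.append(envelope * math.cos(2.0 * math.pi * centre_hz * t))
    peak = max(abs(v) for v in kernel)
    return [v / peak for v in kernel]

def band_energy(signal, kernel):
    """Energy of the signal after causal convolution with the kernel."""
    energy = 0.0
    for i in range(len(signal)):
        acc = 0.0
        for j in range(min(i + 1, len(kernel))):
            acc += signal[i - j] * kernel[j]
        energy += acc * acc
    return energy

# Usage: a 1 kHz tone should excite the 1 kHz band far more than the 4 kHz band.
sample_rate = 16000
tone = [math.sin(2.0 * math.pi * 1000.0 * i / sample_rate) for i in range(800)]
energy_1k = band_energy(tone, gammatone_kernel(1000.0, sample_rate))
energy_4k = band_energy(tone, gammatone_kernel(4000.0, sample_rate))
```

A bank of such kernels at ERB-spaced centre frequencies would give one energy value per band per frame. In a pipeline like the one the abstract describes, those band energies would still need to be encoded into a form the Spatial Pooler or a k-NN classifier accepts, a step this sketch does not cover.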
Keywords
Signal processing, Wavelet transforms, Bioacoustics, Machine learning, Spatial pooling, Hierarchical temporal memory, k-NN classifier