Adaptive Spiking Neural Networks for Audiovisual Pattern Recognition

  • Simei Gomes Wysoski
  • Lubica Benuskova
  • Nikola Kasabov
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4985)


The paper describes the integration of brain-inspired systems to perform audiovisual pattern recognition tasks. Individual sensory pathways, as well as the integrative modules, are implemented using a fast version of spiking neurons grouped in evolving spiking neural network (ESNN) architectures capable of lifelong adaptation. We design a new crossmodal integration system in which individual modalities can influence one another before individual decisions are made, a property that resembles some characteristics of biological brains. The system is applied to the person authentication problem. Preliminary results show that the integrated system can improve accuracy at many operating points and that it enables a range of multi-criteria optimizations.


Spiking Neural Networks; Multi-modal Information Processing; Face and Speaker Recognition; Visual and Auditory Integration





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Simei Gomes Wysoski (1)
  • Lubica Benuskova (1)
  • Nikola Kasabov (1)
  1. Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand
