Brain-Like Evolving Spiking Neural Networks for Multimodal Information Processing

Part of the book series: Studies in Computational Intelligence ((SCI,volume 266))

Abstract

Despite much evidence suggesting how and where sensory information converges in the human brain, the neural mechanisms of interaction among modalities at the level of neuronal cells and ensembles are still not well understood. The chapter explores the emulation of multimodal information processing in a brain-like manner through evolving spiking neural network (ESNN) architectures that use several multimodal characteristics of biological brains, e.g., multisensory neurons, crossmodal connections, the capacity for lifelong adaptation and evolution, and adaptive pattern recognition. An illustration is given using an audiovisual ESNN for the person authentication problem. Preliminary results show that the integrated system can improve accuracy at many operating points and that it enables a range of multi-criteria optimizations.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Wysoski, S.G., Benuskova, L., Kasabov, N. (2010). Brain-Like Evolving Spiking Neural Networks for Multimodal Information Processing. In: Hanazawa, A., Miki, T., Horio, K. (eds) Brain-Inspired Information Technology. Studies in Computational Intelligence, vol 266. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04025-2_3

  • DOI: https://doi.org/10.1007/978-3-642-04025-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04024-5

  • Online ISBN: 978-3-642-04025-2

  • eBook Packages: Engineering, Engineering (R0)
