Feedback in Multimodal Self-organizing Networks Enhances Perception of Corrupted Stimuli

  • Andrew P. Papliński
  • Lennart Gustafsson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4304)


It is known from psychology and neuroscience that multimodal integration of sensory information enhances the perception of stimuli that are corrupted in one or more modalities. A prominent example is that auditory perception of speech is enhanced when speech is bimodal, i.e. when it also has a visual modality. The function of the cortical network that processes speech in the auditory and visual cortices and in multimodal association areas is modeled with a Multimodal Self-Organizing Network (MuSON), consisting of several Kohonen Self-Organizing Maps (SOMs) with both feedforward and feedback connections. Simulations with heavily corrupted phonemes and uncorrupted letters as inputs to the MuSON demonstrate strongly enhanced auditory perception. This enhancement is explained by feedback from the bimodal area into the auditory stream, as in cortical processing.
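The idea of feedback into the auditory stream can be illustrated with a very small sketch. It is not the authors' MuSON implementation: the map size, the hand-set weight values, the `kohonen_step` helper and the single-component `feedback` signal are all assumptions chosen to keep the example readable. It shows the standard Kohonen update on a 1-D map, and how appending a feedback component (here imagined as coming from a bimodal map driven by an uncorrupted letter) can change which auditory unit wins for a corrupted phoneme.

```python
import math

def winner(weights, x):
    """Return the index of the unit whose weight vector is nearest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def kohonen_step(weights, x, eta=0.5, sigma=1.0):
    """One Kohonen update on a 1-D map: pull the winner and its
    neighbours toward x, weighted by a Gaussian neighbourhood."""
    k = winner(weights, x)
    for i, w in enumerate(weights):
        h = math.exp(-((i - k) ** 2) / (2.0 * sigma ** 2))
        for d in range(len(w)):
            w[d] += eta * h * (x[d] - w[d])
    return k

# Hand-set (not trained) weights, purely illustrative assumptions.
# Unit 0 stands for phoneme class "a", unit 1 for phoneme class "b".
audio_map = [[0.0, 0.0], [1.0, 1.0]]               # auditory features only
audio_map_fb = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]  # features plus one feedback component

corrupted_a = [0.6, 0.6]  # a heavily corrupted instance of class "a"
feedback = [0.0]          # hypothetical bimodal signal driven by the clean letter "a"

without_fb = winner(audio_map, corrupted_a)
with_fb = winner(audio_map_fb, corrupted_a + feedback)
print(without_fb, with_fb)  # prints "1 0"
```

Without the feedback component the corrupted phoneme lies closer to the wrong unit; with it, the extra dimension separates the two classes enough for the correct unit to win. In a trained network the weights would come from repeated calls to something like `kohonen_step` rather than being set by hand.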


Keywords (added by machine, not by the authors): Auditory Cortex; Auditory Perception; Superior Temporal Sulcus; Feedback Connection; Auditory Stream





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Andrew P. Papliński (1)
  • Lennart Gustafsson (2)
  1. Clayton School of Information Technology, Monash University, Australia
  2. Computer Science and Electrical Engineering, Luleå University of Technology, Sweden
