Audition as a Trigger of Head Movements
In realistic multimodal environments, audition and vision are the two prominent sensory modalities that work together to provide humans with the best possible perceptual understanding of the environment. Yet, when designing artificial binaural systems, this collaboration is often not honored. Instead, substantial effort is made to construct best-performing, purely auditory scene-analysis systems, sometimes with goals and ambitions that reach beyond human capabilities. It is often not considered that what enables us to perform so well in complex environments is the ability of (i) using more than one source of information, for instance, visual in addition to auditory information, and (ii) making assumptions about the objects to be perceived on the basis of a-priori knowledge. In fact, the human capability of inferring information from one modality to another helps substantially to efficiently analyze the complex environments that humans face every day. Along this line of thinking, this chapter addresses the effects of attention reorientation triggered by audition. Accordingly, it discusses mechanisms that lead to appropriate motor reactions, such as head movements for orienting our visual sensors toward an audiovisual object of interest. After presenting some of the neuronal foundations of multimodal integration and of motor reactions linked to auditory-visual perception, some ideas and issues from the field of robotics are tackled. This is accomplished by referring to computational modeling. Thereby, some of the biological bases that underlie active multimodal perception are discussed, and it is demonstrated how these can be taken into account when designing artificial agents endowed with human-like perception.
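The core idea of the chapter, audition triggering a head movement that brings the visual sensor onto a sound source, with the two modalities then fused, can be illustrated by a minimal sketch. This is not the chapter's actual model; all function names, the confidence-weighted fusion rule, and the salience threshold are illustrative assumptions.

```python
def fuse_azimuth(audio_az, visual_az, audio_conf, visual_conf):
    """Confidence-weighted fusion of auditory and visual direction
    estimates (azimuths in degrees). Purely illustrative."""
    total = audio_conf + visual_conf
    if total == 0.0:
        return 0.0
    return (audio_conf * audio_az + visual_conf * visual_az) / total


def head_turn_command(audio_az, audio_conf,
                      visual_az=None, visual_conf=0.0,
                      trigger_threshold=0.3):
    """Return a head-rotation command (degrees) toward a sound event,
    or None if the event is not salient enough to reorient attention.

    The threshold stands in for the reflexive/attentional gating the
    chapter discusses; its value here is an arbitrary assumption."""
    if audio_conf < trigger_threshold:
        return None  # event not salient: keep the current gaze direction
    if visual_az is None:
        # Audition alone reorients the visual sensor toward the source
        return audio_az
    # Once the source is also seen, both modalities shape the estimate
    return fuse_azimuth(audio_az, visual_az, audio_conf, visual_conf)
```

For example, a confident auditory event at 45° with no visual counterpart yields a 45° turn, while a weak event below the threshold leaves the head where it is, a crude stand-in for the modulated turn-to reflex the chapter elaborates.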
This work has been supported by the European FP7 TWO!EARS project, ICT-618075, www.twoears.eu. We also thank two anonymous reviewers for their helpful comments on this work.