Abstract
Through their different senses, human observers acquire several kinds of information from their environment, and complex cross-modal interactions occur during this perceptual process. This article proposes a framework to analyze and model these interactions through a rigorous and systematic data-driven process. Doing so requires considering the general relationships between the physical events or factors involved, not only in quantitative terms but also in terms of the influence of one factor on another. We use tools from information theory and probabilistic reasoning to derive relationships between the random variables of interest, with conditional independence as the central notion. Using mutual information analysis to guide the model elicitation process, we obtain a probabilistic causal model encoded as a Bayesian network. We illustrate the method with data collected from human subjects in an audio-visual localization task, and we show that it yields a well-motivated model with good predictive ability. The model elicitation process offers new prospects for investigating the cognitive mechanisms of multisensory perception.
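As a rough illustration of how mutual information can point to a conditional independence, the sketch below computes plug-in estimates of I(X;Y) and I(X;Y|Z) from discrete samples. It is a minimal example under stated assumptions, not the estimator used in the article: the function names, the toy variables `xs`, `ys`, `zs`, and the noise pattern are all hypothetical, and real localization data would first need to be discretized.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # sum over observed cells of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X;Y|Z) = sum_z p(z) * I(X;Y | Z=z) in bits."""
    n = len(zs)
    groups = {}
    for x, y, z in zip(xs, ys, zs):
        gx, gy = groups.setdefault(z, ([], []))
        gx.append(x)
        gy.append(y)
    return sum(
        (len(gx) / n) * mutual_information(gx, gy)
        for gx, gy in groups.values()
    )


# Hypothetical toy data: Z is a common cause; X and Y are copies of Z,
# each flipped independently in 1 of 4 cases. Enumerating the full
# Cartesian product makes X and Y exactly independent given Z.
xs, ys, zs = [], [], []
for z, flip_x, flip_y in product([0, 1], [0, 0, 0, 1], [0, 0, 0, 1]):
    zs.append(z)
    xs.append(z ^ flip_x)
    ys.append(z ^ flip_y)

mi = mutual_information(xs, ys)                   # positive: X and Y are dependent
cmi = conditional_mutual_information(xs, ys, zs)  # zero: X is independent of Y given Z
print(f"I(X;Y) = {mi:.4f} bits, I(X;Y|Z) = {cmi:.4f} bits")
```

In this toy case the marginal dependence between X and Y vanishes once Z is known, which is the kind of pattern that would suggest a structure with Z as a common parent of X and Y and no direct edge between them. With finite, noisy data the estimates are biased upward, so a threshold or a statistical test would be needed in practice.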
Cite this article
Besson, P., Richiardi, J., Bourdin, C. et al. Bayesian networks and information theory for audio-visual perception modeling. Biol Cybern 103, 213–226 (2010). https://doi.org/10.1007/s00422-010-0392-8
DOI: https://doi.org/10.1007/s00422-010-0392-8