Abstract
The interaction between vision and language processing is of interest to both cognitive psychologists and psycholinguists. Recent research has begun to clarify this interaction in terms of the representational issues involved. In this paper, we first review some of the theoretical and methodological issues in the current debate on vision–language interaction. We then develop a model that attempts to account for the effects of affordances and visual context on language–scene interaction, as well as the role of sensorimotor simulation. The paper addresses theoretical issues related to the mental representations that arise when the visual and linguistic systems interact.
Acknowledgments
The authors thank Keith Rayner, Rick Dale, and Falk Huettig for comments on the content of the manuscript. We also thank Diana Pham and Rosalyn Shute for their comments on the structure of this paper.
Cite this article
Mishra, R.K., Marmolejo-Ramos, F. On the mental representations originating during the interaction between language and vision. Cogn Process 11, 295–305 (2010). https://doi.org/10.1007/s10339-010-0363-y
DOI: https://doi.org/10.1007/s10339-010-0363-y