On the mental representations originating during the interaction between language and vision

  • Review
  • Cognitive Processing

Abstract

The interaction between vision and language processing is of interest to both cognitive psychologists and psycholinguists. Recent research has begun to clarify the representational issues involved in this interaction. In this paper, we first review some of the theoretical and methodological issues in the current vision–language interaction debate. We then develop a model that attempts to account for the effects of affordances and visual context on language–scene interaction, as well as the role of sensorimotor simulation. The paper addresses theoretical issues related to the mental representations that arise when visual and linguistic systems interact.

Acknowledgments

The authors thank Keith Rayner, Rick Dale, and Falk Huettig for comments on the content of the manuscript. We also thank Diana Pham and Rosalyn Shute for their comments on the structure of this paper.

Author information

Corresponding author

Correspondence to Ramesh Kumar Mishra.

About this article

Cite this article

Mishra, R.K., Marmolejo-Ramos, F. On the mental representations originating during the interaction between language and vision. Cogn Process 11, 295–305 (2010). https://doi.org/10.1007/s10339-010-0363-y

  • DOI: https://doi.org/10.1007/s10339-010-0363-y
