Abstract
Does action play any crucial role in our perception of pictures? The standard literature on picture perception has never explicitly tackled this question. This is for a simple reason. After all, objects in a picture seem to be static objects of perception. Thus, it might sound extremely controversial to say that action is crucial in picture perception. Contrary to this general intuitive stance, this paper defends, for the first time, the apparently very controversial claim, never addressed in the literature, that some of the specific and essential relations between vision and action make action (and its motoric basis) crucial in order for us to enter pictorial experience. I first discuss two ways in which vision and action are deeply linked, by describing the famous notions of Vision-for-Action and Sensorimotor Understanding. Then, I describe the special role they play in generating ordinary pictorial experience and suggest that, when we cannot rely on them while in front of a picture, we lose pictorial experience.
Notes
By ‘normal visual experience’ here I simply mean the veridical visual experience of a real, i.e. non-pictorial, object, i.e. the visual experience we obtain during the visual perception of a real object.
One might argue that our best model of visual neuroscience, the ‘Two Visual Systems Model’, suggests that visual recognition and vision-for-action are separate processes, and that the visual representations related to action are not conscious (Milner and Goodale 1995/2006). However, this was only the initial hypothesis. Recent evidence, and philosophical reflection on it, suggests that they have anatomo-functional links that lead vision (even conscious vision) to be strictly related to action (for a philosophical analysis see Ferretti 2018, 2020b, 2021; forthcoming). These anatomo-functional links also operate in picture perception, concerning different tasks (Ferretti 2016c, 2018).
Of course, those who embrace this claim also accept that there can be some differences between the perceptual visual state we are in during ordinary perception and the one we are in during picture perception. Still, both are genuinely perceptual, visual states (for recent reviews see Nanay 2011a, 2015; Ferretti 2017a, 2018, 2020a; Ferretti and Marchi 2020).
Though the relations between vision and action can take different forms, these are the two main forms mentioned in the literature, which are taken to be the most important and general. Most of the other possible relations can be just seen as sub-classes of these two (Briscoe and Grush 2015; Nanay 2013, 2018; Ferretti and Zipoli Caiani 2019; Clark 2001, 2007, 2009).
I have said that not only the development of visual perception, but also its actual exercise, needs action and movement. Especially concerning sensorimotor understanding, this, of course, does not mean that, if I am immobile, I cannot get any visual experience. Nonetheless, it has been clearly shown that the resources of action processing massively shape the way we perceive the external environment in everyday life (O’Regan and Noë 2001; Noë 2004; Nanay 2013; Briscoe and Grush 2015; Ferretti and Zipoli Caiani 2019; Bishop and Martin 2014). Movement constrains the way a particular dynamic biological subject develops its specific and peculiar kind of visual experience (Ibid.), as different movements related to different bodily structures can lead to different (more or less complex) visual experiences. Also, completely static conditions are very unnatural with respect to both the development and the usual condition of dynamic subjects such as humans (Ibid.).
In this respect, though both are widely endorsed in the literature, (1) is a much more complex notion than (2), which is currently taken to be a very intuitive and uncontroversial claim, and for which several arguments have been provided (Nanay 2012a; Ferretti 2018). For this reason, I focus on the specific way of spelling out (1) and do not need to explicitly discuss (2).
This means that, normally, an unconscious visual representation of the surface can modulate the conscious visual representation of the depicted object.
Recall, however, that my goal is simply to explain how action plays a crucial role in shaping these two visual representations, which are crucial for entering pictorial experience. So, while my account is in line with the best notion of simultaneity endorsed in the literature, I could also mention these visual representations in very general terms (independently of whether one wants to rigidly conceive them as conscious or unconscious), avoiding any problematic angle on the notion of simultaneity.
In the words of the authors of the experimental study: “When a picture is viewed normally with both eyes, the picture’s surface is visible because of cues such as binocular disparity and the visible frame of the picture (…). Distance cues (…) specify the distance of this visible picture surface (…) rather than the pictorial contents (…). There are no known optical cues that specify the distance of pictorial objects from the observer. Therefore, under binocular viewing of pictures, although 3-D object shapes can be clearly perceived, their scale and absolute depth should remain optically unspecified” (Vishwanath and Hibbard 2013: 1682–1683; see also Vishwanath 2014: 159–160).
Again: “Monocular aperture viewing removes the main cues that specify the presence of the picture surface (…), as well as binocular cues specifying its distance (…). In the absence of visible picture surfaces, (…) the brain attributes the accommodation response to the pictorial objects, and assigns any associated distance information to them, allowing absolute depth values to be derived.”(Vishwanath and Hibbard 2013: 1683; see also Vishwanath 2014: 159–160).
A view that is not in conflict with previous accounts of picture perception (offered by Nanay 2011a, 2015 and Matthen 2005), but extends them in line with the current evidence available from vision science. In this respect, the general and neutral notion of ‘vision-for-action’ used here and the one of ‘motion-guiding-vision’ offered by Matthen (2005) seem to be perfectly compatible.
These inferences are directly related to the interpretation of the passage quoted in footnote 10.
As said (Sect. 5), the standard story on simultaneity goes against the possibility of simultaneous conscious visual states as both pertaining to the surface and the depicted object. However, those few who still want to maintain (against the most common view) that we visually represent, consciously, both the surface and the depicted object simultaneously (despite the numerous, philosophical and empirical problems with this claim, see Ferretti and Marchi 2020), can interpret my neutral sentence “we do not visually represent” as “we do not visually represent consciously”. Those who do not agree, and embrace the common stance on simultaneity, as the present paper does, could, in principle, interpret this sentence as suggesting that “we do not visually represent at all, i.e. either consciously or unconsciously” (on the discussion of the implications of this move in picture perception, see Ferretti 2020a). The case of trompe l’oeils however, as we shall see, will further suggest that the common stance is the most reliable. (Cfr. footnote 8).
Just a trivial conceptual point: of course, this is about situations in which one actually is in front of a picture. (Otherwise, this would literally imply, following a misunderstanding, that, when I don’t see a picture at all—in standard, everyday life—there is a depicted object that looks present for motor interaction).
These inferences are directly related to the interpretation of the passage quoted in footnote 9.
The idea that these inferences can be obtained from the experimental evidence mentioned above has been recently defended in the literature (Ferretti 2020a). Furthermore, we can build even more technical inferences, with more specific details on the behavior of the visual system, in relation to the numerous visual aspects of the ascription of depth and spatial properties, as well as concerning the distinction between conscious and unconscious vision, in these perceptual scenarios. The way these inferences are offered here is, however, sufficient for the point at stake in the paper. For a more detailed philosophical discussion see (Ferretti 2016c, 2018, 2017a, b, 2020a, c). Accordingly, note that (F), (F*), (FF) and (FF*) are not related in a deductive manner, but are grounded in inferences to the best explanation.
I suppose it is clear that this aspect of ordinary pictorial experience does not need to build on any argument here: it is visually evident (a phenomenal evidence) to every (human) observer (Matthen 2005; Nanay 2011a, 2015; Hopkins 2012; Aasen 2015; Pettersson 2011; Noë 2012; Ferretti 2016c, 2017b, 2018, 2019a, 2020a, c).
In line with footnote 17, this is also visually evident to every (human) observer (Ferretti 2016c, 2017b, 2018, 2019a, 2020a, c). And since trompe l’oeils deceive the viewer (they fool the eye), most of the time, only momentarily, the explanation here concerns the moment in which the deception is at work.
It has been suggested that with trompe l’oeils we do not perceive the surface either consciously or unconsciously: we (our visual system) simply cannot track its presence (Ferretti 2020a). And this is the only explanation compatible with the current theories of pictorial simultaneity mentioned in (Sect. 5), in the light of the experimental results from visual neuroscience (Ferretti 2018, 2019a, 2020a, c). My account is perfectly compatible with this explanation (also in line with Footnotes 7, 8, and 13): when we can (at least) unconsciously visually track the presence of the surface, we can, accordingly, visually experience the pictorial object as such. When not (i.e. when the surface is not tracked at all), we fall into the trompe l’oeil effect. The same holds, as we shall see, for what is said concerning sensorimotor understanding and trompe l’oeils.
This is a quick and efficient description of why they might be detected: “A picture viewed from its center of projection generates the same retinal image as the original scene, so the viewer perceives the scene correctly. When a picture is viewed from other locations, the retinal image specifies a different scene, but we normally do not notice the changes” (Vishwanath et al. 2005: 1401).
It is worth mentioning that there is another account that aims to explain the lack of visual representation of spatial shifts related to the pictorial space. The explanation would be that the spectator does not focus on the distortions pertaining to the retinal image. This is due to the fact that monocular cues pertaining to the pictorial content are salient for her conscious visual representations. Accordingly, the visual information related to the surface is not relevant for the focus of her conscious visual representations attuned to the pictorial content. If we embrace this story, it is possible to suggest that there is no compensation at work here (cfr. the technical analysis offered by Briscoe 2018 (p. 70) of Koenderink et al. 2004: 515, 526). This might be considered, however, a less reliable view, though one in accordance with the notion that the viewer does not need any awareness of the surface: there might be, indeed, just “some weak consensus” about this explanation (Koenderink et al. 2004: 526; see Nanay 2011a: 471, 2015: 187, 188 and Briscoe 2018: 70). In this respect, the account offered by Vishwanath et al. (2005) mentioned above is more recent than the one offered by Koenderink et al. Furthermore, it is more reliable in the light of several arguments, which I cannot report here, offered by Vishwanath et al. (2005: 1404), also in relation to what is suggested by Koenderink and van Doorn (2003). So, an account of spatial shifts built upon the notion of compensation is the most reliable (for a recent and complete review, see Ferretti 2020c), and it is also in tune with the philosophical speculations on the Two Visual Systems Model (Nanay 2015), mentioned above. Consider also that the notion of ‘awareness’ and its relation to the one of ‘consciousness’ is not always clear in the literature on picture perception, and this may lead to linguistic misunderstandings concerning the way we use them in our explanations (see Ferretti 2020a, Ferretti and Marchi 2020).
We shall see below how, and to what extent, this is possible. This will be a crucial point for the claim defended in this section.
As said in footnote 23 concerning N*, we shall see below how, and to what extent, this is possible. It is also worth noting that the fact that these are the inferences that it is possible to draw from the account discussed above has been recently suggested in the literature. For a more detailed philosophical discussion see (Ferretti 2020c). As for the case of (F), (F*), (FF) and (FF*) (see footnote 16), the reader should note that (N), (N*), (NN) and (NN*) are not related in a deductive manner, but based on inferences to the best explanation.
In general, “the better the surface-slant information, the greater the invariance” (p. 1404). Indeed, in the study “When surface-slant information was limited, invariance was not observed” (Ibid).
And, as seen, in this case the pictorial object is not visually experienced as such, but as a present one suitable for motor interaction.
I cannot offer more details here (cfr. Footnote 19). For a complete and more technical review of the arguments in full detail, see (Ferretti 2020c).
This alleged illusory effect is, however, at the basis of the lack of spatial shifts with standard depicted scenes, and thus a crucial aspect of usual pictorial experience.
I want to thank two anonymous reviewers, whose crucial and insightful comments allowed me to improve the paper. Special thanks also go to the excellent scholars who have always been ready to discuss enthusiastically with me several issues concerning the functioning of the visual system in relation to picture perception. In particular, I want to thank Bence Nanay, Silvano Zipoli Caiani and Albert Newen for spending so much time discussing these topics with me. Finally, previous versions of this work have been presented at different workshops, lectures and conferences. Special thanks go to the audiences of the following events, at which I presented my ideas: the San Raffaele Spring School of Philosophy on Perception and Aesthetic Experience, held at the San Raffaele University of Milan in May 2017; the Fourth Conference of the European Network for the Philosophy of Language and Mind, held at the Ruhr-University Bochum in September 2017; the Sifa Conference of the Italian Society for Analytic Philosophy, held at the University of Eastern Piedmont, Novara in September 2018; the Depiction, Pictorial Experience and Vision Science Conference, held at the University of Glasgow in November 2018; my lecture at the ITAB Institute for Advanced Biomedical Technologies, Department of Neuroscience, Imaging and Clinical Sciences, G. d’Annunzio University of Chieti in December 2018; the Workshop on Human Nature: Philosophy and the Natural Sciences, held at the University of Milan in December 2019; my lectures at the Department of Philosophy of the University of Florence in January 2020 and at the Department of Philosophy of the University of Basel in February 2020; and my NOMIS Lecture, held at the Eikones, Center for the Theory and History of the Image at the University of Basel in October 2020.
In particular, I would like to thank Vittorio Gallese, Alva Noë, Robert Briscoe, Jesse Prinz, Derek Brown, Michael Newall, John Kulvicki, Dhanraj Vishwanath, Nicholas Wade, Solveig Aasen, Bence Nanay, Paolo Spinicci, Alberto Voltolini, Giorgia Committeri, Marcello Costantini, Albert Newen, Colin Klein, Marco Viola, Andrea Borghini, Roberta Lanfredini and Marco Nathan for offering precious comments on the ideas developed in this paper. Special thanks also go to the members of my group at Eikones, University of Basel, who have been discussing these topics with me over the last months: Friederike Zenker, Ralph Ubl, Markus Wild, Markus Klammer, Jakub Stejskal and Seth Watter. Since the work for the publication of the present article was finalized during the transition from one research position to another, I would like to acknowledge support from both the corresponding fellowships. This work was supported by a NOMIS Fellowship, awarded by the Eikones, Center for the Theory and History of the Image at the University of Basel, Switzerland. This work was also supported by a German Humboldt Fellowship, hosted by Professor Albert Newen at the Institute for Philosophy II, Ruhr-University Bochum, Germany.
References
Aasen, S. (2015). Pictures, presence and visibility. Philosophical Studies. https://doi.org/10.1007/s11098-015-0475-4.
Bishop, J. M., & Martin, A. O. (Eds.). (2014). Contemporary sensorimotor theory, studies in applied philosophy. Berlin, New York: Springer.
Briscoe, R. (2016). Depiction, pictorial experience, and vision science. In C. Hill & B. McLaughlin (Eds.), New directions in the philosophy of perception. Philosophical Topics 44(2), 41–87.
Briscoe, R. (2018). Gombrich and the Duck–Rabbit. In M. Beaney, B. Harrington, & D. Shaw (Eds.), Aspect perception after Wittgenstein: Seeing-as and novelty (pp. 49–88). London: Routledge.
Briscoe, R., & Grush, R. (2015). Action-based theories of perception. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy. https://www.plato.stanford.edu/archives/fall2015/entries/action-perception/.
Campbell, J. (2002). Reference and consciousness. Oxford: Oxford University Press.
Cavedon-Taylor, D. (2018). Sensorimotor expectations and the visual field. Synthese. https://doi.org/10.1007/s11229-018-01946-4.
Chasid, A. (2014). Pictorial experience: Not so special after all. Philosophical Studies, 171, 471–491.
Clark, A. (2001). Visual experience and motor action: Are the bonds too tight? The Philosophical Review, 110, 495–519.
Clark, A. (2007). What reaching teaches: Consciousness, control, and the inner Zombie. British Journal for the Philosophy of Science, 58, 563–594.
Clark, A. (2009). Perception, action, and experience: Unraveling the golden braid. Neuropsychologia, 47, 1460–1468.
Erkelens, C. J. (2013). Virtual slant explains perceived slant, distortion, and motion in pictorial scenes. Perception, 42(3), 253–270.
Ferretti, G. (2016a). Pictures, action properties and motor related effects. Synthese, Special Issue: Neuroscience and Its Philosophy, 193(12), 3787–3817. https://doi.org/10.1007/s11229-016-1097-x.
Ferretti, G. (2016b). Through the forest of motor representations. Consciousness and Cognition, 43, 177–196. https://doi.org/10.1016/j.concog.2016.05.013.
Ferretti, G. (2016c). Visual feeling of presence. Pacific Philosophical Quarterly. https://doi.org/10.1111/papq.12170.
Ferretti, G. (2017a). Pictures, emotions, and the dorsal/ventral account of picture perception. Review of Philosophy and Psychology, 8(3), 595–616. https://doi.org/10.1007/s13164-017-0330-y.
Ferretti, G. (2017b). Are pictures peculiar objects of perception? Journal of the American Philosophical Association. https://doi.org/10.1017/apa.2017.28.
Ferretti, G. (2017c). Two visual systems in Molyneux subjects. Phenomenology and the Cognitive Sciences. https://doi.org/10.1007/s11097-017-9533-z.
Ferretti, G. (2018). The neural dynamics of seeing-in. Erkenntnis, 84, 1285–1324.
Ferretti, G. (2019a). Perceiving surfaces (and what they depict). In B. Glenney & J. F. Silva (Eds.), The senses and the history of philosophy (pp. 308–322). London: Routledge.
Ferretti, G. (2019b). Visual phenomenology versus visuomotor imagery: How can we be aware of action properties? Synthese. https://doi.org/10.1007/s11229-019-02282-x.
Ferretti, G. (2020a). Why Trompe l’oeils deceive our visual experience. The Journal of Aesthetics and Art Criticism, 78(1), 33–42.
Ferretti, G. (2020b). Anti-intellectualist motor knowledge. Synthese. https://doi.org/10.1007/s11229-020-02750-9.
Ferretti, G. (2020c). Do Trompe l’oeils look right when viewed from the wrong place? The Journal of Aesthetics and Art Criticism, 78(3), 319–330. https://doi.org/10.1111/jaac.12750.
Ferretti, G. (2020d). Action at first sight. In G. Ferretti & B. Glenney (Eds.), Molyneux’s question and the history of philosophy (pp. 284–299). London: Routledge.
Ferretti, G. (2021). A distinction concerning vision-for-action and affordance perception. Consciousness and Cognition. https://doi.org/10.1016/j.concog.2020.103028.
Ferretti, G. (Forthcoming). On the Content of Peripersonal Visual Experience. Phenomenology and the Cognitive Sciences.
Ferretti, G., & Marchi, F. (2020). Visual attention in pictorial perception. Synthese. https://doi.org/10.1007/s11229-020-02873-z.
Ferretti, G., & Zipoli Caiani, S. (2019). Between vision and action. Introduction to the special issue. Synthese. https://doi.org/10.1007/s11229-019-02518-w.
Gallagher, S. (2020). No yes answers to Molyneux. In G. Ferretti & B. Glenney (Eds.), Molyneux’s question and the history of philosophy (pp. 235–249). London: Routledge.
Gerhard, T. M., Culham, J. C., & Schwarzer, G. (2016). Distinct visual processing of real objects and pictures of those objects in 7- to 9-month-old infants. Frontiers in Psychology, 7, 827. https://doi.org/10.3389/fpsyg.2016.00827.
Gibson, J. J. (1979/1986). The ecological approach to visual perception. Boston: Houghton Mifflin.
Gregory, R. L. (2012). Pictures as strange objects of perception. In F. G. Barth, P. Giampieri-Deutsch, & H.-D. Klein (Eds.), Sensory perception: Mind and matter (pp. 175–181). Amsterdam: Springer.
Grzeczkowski, L. A., Clarke, M., Francis, G., Mast, F. W., & Herzog, M. H. (2017). About individual differences in vision. Vision Research, 141, 282–292.
Hecht, H., Schwartz, R., & Atherton, M. (Eds.). (2003). Looking into pictures: An interdisciplinary approach to pictorial space. Cambridge: MIT Press.
Hopkins, R. (2012). Seeing-in and seeming to see. Analysis, 72, 650–659.
Koenderink, J. J., van Doorn, A. J., Kappers, A. M. L., & Todd, J. T. (2004). Pointing out of the picture. Perception, 33, 513–530.
Koenderink, J. J., & van Doorn, A. J. (2003). Pictorial space. In H. Hecht, R. Schwartz, & M. Atherton (Eds.), Looking into pictures: An interdisciplinary approach to pictorial space (pp. 239–299). Cambridge: MIT Press.
Kulvicki, J. (2006). On images. Oxford: Clarendon Press.
Lopes, D. M. (2005). Sight and sensibility: Evaluating pictures. Oxford: Oxford University Press.
Maniatis, L. M. (2017). The Bathtub illusion. In A. Shapiro & D. Todorović (Eds.), The Oxford compendium of visual illusions (pp. 238–240). Oxford: Oxford University Press.
Matthen, M. (2005). Seeing, doing and knowing: A philosophical theory of sense perception. Oxford: Oxford University Press.
Milner, A. D., & Goodale, M. A. (1995/2006). The visual brain in action. 2nd edn. Oxford: Oxford University Press.
Nanay, B. (2010). Transparency and sensorimotor contingencies: Do we see through photographs? Pacific Philosophical Quarterly, 91, 463–480.
Nanay, B. (2011a). Perceiving pictures. Phenomenology and the Cognitive Sciences, 10, 461–480.
Nanay, B. (2011b). Do we see apples as edible? Pacific Philosophical Quarterly, 92, 305–322.
Nanay, B. (2012a). The philosophical implications of the Perky experiments. Analysis, 72, 439–443.
Nanay, B. (2012b). Action-oriented perception. European Journal of Philosophy, 20, 430–446.
Nanay, B. (2013). Between perception and action. Oxford: Oxford University Press.
Nanay, B. (2015). Trompe l’oeil and the dorsal/ventral account of picture perception. Review of Philosophy and Psychology, 6, 181–197.
Nanay, B. (2016). Aesthetics as philosophy of perception. Oxford: Oxford University Press.
Nanay, B. (2017). Threefoldness. Philosophical Studies. https://doi.org/10.1007/s11098-017-0860-2.
Nanay, B. (2018). Perception is not all-purpose. Synthese. https://doi.org/10.1007/s11229-018-01937-5.
Noë, A. (2004). Action in perception. Cambridge: The MIT Press.
Noë, A. (2012). Varieties of presence. Cambridge, MA: Harvard University Press.
O’Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24, 939–1031.
Papathomas, T. V., Kourtzi, Z., & Welchman, A. E. (2010). Perspective-based illusory movement in a flat billboard—An explanation. Perception, 39(8), 1086–1093.
Pettersson, M. (2011). Seeing what is not there: Pictorial experience, imagination and non-localization. British Journal of Aesthetics, 51, 279–294.
Pirenne, M. H. (1970). Optics, painting, and photography. Cambridge: Cambridge University Press.
Shapiro, A., & Todorović, D. (2017). The Oxford compendium of visual illusions. Oxford: Oxford University Press.
Todorović, D. (2008). Is pictorial perception robust? The effect of the observer vantage point on the perceived depth structure of linear perspective images. Perception, 37(1), 106–125.
Vasari, G. (1568). Vite de’ più eccellenti pittori, scultori ed architettori. Vol. 2, Firenze, Giunti.
Vishwanath, D. (2011). Information in surface and depth perception: Reconciling pictures and reality. In L. Albertazzi, G. J. van Tonder, & D. Vishwanath (Eds.), Perception beyond inference. The information content of visual processes (pp. 201–240). Cambridge, MA: MIT Press.
Vishwanath, D. (2014). Toward a new theory of stereopsis. Psychological Review, 121, 151–178.
Vishwanath, D., & Hibbard, P. (2010). Quality in depth perception: The plastic effect. Journal of Vision, 10, 1673–1685.
Vishwanath, D., & Hibbard, P. (2013). Seeing in 3D with just one eye: Stereopsis in the absence of binocular disparities. Psychological Science, 24, 1673–1685.
Vishwanath, D., Girshick, A. R., & Banks, M. S. (2005). Why pictures look right when viewed from the wrong place. Nature Neuroscience, 8, 1401–1410.
Voltolini, A. (2013). Why, as responsible for figurativity, seeing-in can only be inflected seeing-in. Phenomenology and the Cognitive Sciences, 14(3), 651–667.
Wollheim, R. (1980). Seeing-as, seeing-in, and pictorial representation. In Art and its objects (2nd ed., pp. 205–226). Cambridge: Cambridge University Press.
Wollheim, R. (1998). On pictorial representation. Journal of Aesthetics and Art Criticism, 56, 217–226.
Cite this article
Ferretti, G. Why the Pictorial Needs the Motoric. Erkenn 88, 771–805 (2023). https://doi.org/10.1007/s10670-021-00381-1
DOI: https://doi.org/10.1007/s10670-021-00381-1