Philosophizing cannot substitute for experimentation: comment on Hoffman, Singh & Prakash (2014)
The perception of a 3D shape must be excluded from Hoffman et al.’s “interface theory” primarily because shape is characterized by its symmetries. When these symmetries are used as a priori constraints, 3D shapes are always recovered from 2D retinal images veridically. These facts make it clear that 3D shape perception is completely different from, as well as more important than, all other perceptions because the veridicality of our perception of 3D shapes (and 3D scenes) accounts for our successful adaptation to the natural environment.
Keywords: Vision, Shape, Symmetry
I am an “expert” on 3D shape perception, so I will confine my comments about veridicality to shape. Unlike Hoffman et al., I do not have the temerity required to discuss veridicality in vision in general, as well as in smell, taste, touch and hearing in such varied species as humans, bees, and spiders. Furthermore, before I can comment about the “veridicality question” raised by Hoffman et al., I must be sure that the reader knows that the property called “shape” is unique (Pizlo, 2008). It is unique because of its “complexity” (i.e., the number of dimensions characterizing shape is orders of magnitude greater than the number of dimensions characterizing any other property of visual stimuli), and because of its “symmetry” (self-similarity). The reader also needs to know that it is the shape of 3D objects that conveys the information we all use to perceive our world veridically (see Pizlo, 2008; Pizlo et al., 2014). Specifically, the 3D symmetrical shapes of objects allow us not only to perceive the shapes themselves, veridically, but also to perceive the sizes, positions, orientations, and distances among the objects veridically. This means that when we talk about veridical 3D perception, we are referring to our natural visual space, that is, a space with natural, symmetrical objects residing, as they naturally do, on a common ground because of gravity. The concept of an “empty” visual space, used in laboratories, that contains only a few isolated points of light in total darkness, or a few objects floating in the air, no matter how attractive mathematically, is actually empty from an empirical, as well as from a computational, point of view. It is empty because the operation of our visual system depends critically on the system’s a priori knowledge of abstract, but natural, regularities present in the physical world. 
It is also important to appreciate that outside of 3D vision there are not many, if any, abstract regularities of the physical world that could be known a priori, and used to perceive a posteriori. It follows then that there is no logical connection whatsoever between the veridicality of human 3D vision and veridicality in the other senses and other species. Note that the concept of a priori constraints is largely ignored by Hoffman et al. Their neglect of such potentially critical factors allows them to treat such sensations as the taste called “sweetness” and the perception of 3D shape as being two instances of the same perceptual mechanism. There is absolutely nothing in the literature to justify such a view. Quite the opposite is true: there is overwhelming evidence showing that 3D vision, but not the taste of sweetness, is treated as an inverse problem by the perceptual system and is solved as such (Pizlo, 2001; Poggio et al., 1985).
With this said, I will continue my comment by pointing out that the veridicality of visual perception is an empirical question, and visual psychophysics has advanced to the point that we have at hand, and have already used, all of the tools required to answer this question. Speculation about this now is pointless. Several relatively sophisticated experiments on how people see 3D shapes have provided a convincing positive answer to the veridicality question (Pizlo et al., 2010). This, of course, should be sufficient to end all discussion about whether the perception of shape is, or can be, veridical. It is, so there is really nothing left for philosophers or computer scientists to worry about. Ideally, they would stay away from it completely because continuing to fuss about veridicality can only muddy the waters. It would really be a shame to do this now because it took such a long time to separate the wheat from the chaff in the long history of research on shape perception: this took at least 300 years (Pizlo, 2008). The time-honored veridicality issue is, in fact, much less mysterious now than it appeared to be as recently as 30 years ago (Marr, 1982) because we were able to build a machine (a computational model) that sees much as we do (Pizlo et al., 2014). We now know that our robot perceives 3D shapes veridically just as we do. This empirical evidence also agrees with our common sense. The common sense argument goes as follows: Most objects “out there” are symmetrical and we perceive them as such. Animal bodies are symmetrical because of the way the animals move. Plants are symmetrical because of the way they grow. Man-made objects are symmetrical because of the function they serve (Li et al., 2013). The fact that we see symmetrical shapes as symmetrical means that we see them veridically, and if an object is not exactly symmetrical, like a chair with a broken leg, we see it as a symmetrical object with one broken part. If this is not veridicality, what is?
I have been presenting such “symmetry” arguments periodically to the authors of the target paper since 2009. It was already obvious to me back then that their interface theory does not apply to shape perception, no matter how well it applied elsewhere, but they simply could not put their theory aside. They did recognize that shape and symmetry made trouble for it, and the only way they could keep applying their non-veridicality theory to the perception of symmetrical shapes was to claim that it is not our perceptions, but our actions that are symmetrical. This is an unfortunate attempt to revive a type of theory that had been dead, for good reason, for many years – namely, a Motor Theory of Perception. It goes like this: when I look at a symmetrical chair, I know how to sit on it symmetrically, and it is this sitting process, or the intention to sit this way, that provides the basis for visual perception. Sitting is seeing. Bishop Berkeley was probably the first to have come up with this idea when he proposed his motor theory of binocular distance perception. Behaviorists, in the first half of the 20th century, made a similar claim when Watson said that thinking is implicit speech: “thought processes are really motor habits in the larynx” (1913, p. 174). Any contemporary use of a Motor Theory of Perception, and this includes Hoffman et al.’s explanation of 3D shape perception, can be viewed as a legacy of psychology’s Dark Age called “Radical Behaviorism”. This is precisely what Hoffman et al. are offering us. As for me, “no, thank you very much.”
The author thanks Robert M. Steinman for suggestions about this comment. Writing this comment was partially supported by the NIH grant titled “Mechanisms responsible for veridical visual perception” (1R01EY024666-01).
- Hoffman, D. D., Singh, M., & Prakash, C. (2014). The interface theory of perception. Psychonomic Bulletin and Review. In press.
- Marr, D. (1982). Vision. New York: W. H. Freeman.
- Pizlo, Z. (2008). 3D shape: Its unique place in visual perception. Cambridge, MA: MIT Press.