The dorsal stream and the visual horizon
- Cite this article as:
- Madary, M. Phenom Cogn Sci (2011) 10: 423. doi:10.1007/s11097-011-9214-2
Today many philosophers of mind accept that the two cortical streams of visual processing in humans can be distinguished in terms of conscious experience. The ventral stream is thought to produce representations that may become conscious, and the dorsal stream is thought to handle unconscious vision for action. Despite a vast literature on the topic of the two streams, there is currently no account of the way in which the relevant empirical evidence could fit with basic Husserlian phenomenology of vision. Here I offer such an account. In this article, I show how the empirical evidence ought to be understood in a way that is informed by phenomenology. The differences in the two streams are better described as differences in spatial and temporal processing. Rather than simply “unconscious,” the dorsal stream can be better described as making a special contribution to what Husserl identified as the visual horizon.
Keywords: Two visual systems, Perception, Phenomenology
There is a large body of evidence from cognitive neuroscience which supports a distinction between two cortical processing streams in the human visual system. The two cortical systems are the ventral stream, which projects from primary visual cortex to inferotemporal cortex, and the dorsal stream, which projects from primary visual cortex to posterior parietal cortex. The evidence regarding the two streams has been used in support of theories about visual consciousness. The most influential of these theories has been developed by David Milner and Melvyn Goodale (1995, 2005), who have focused on the functional output of the two streams. They have argued that the ventral stream is devoted to “vision for perception,” which can contribute to conscious experience, and the dorsal stream is devoted to “vision for action,” which cannot in principle contribute to consciousness (2010: 75). Their theory and most of its rivals are driven by empirical evidence, with little regard for the phenomenology of visual experience.1 As a result of this methodology, there is currently no account of visual consciousness which accommodates both the empirical evidence about the two cortical streams and the basic tenets of the Husserlian phenomenological tradition. Not only is there no such account, but the existing theories appear to be in tension with the phenomenological tradition.
For instance, Husserl understood visual perception as an ongoing process involving the anticipation of the sensory consequences of bodily action, and the fulfillment of that anticipation. Merleau-Ponty followed up on this insight and focused on the interdependence between action and perception. At least prima facie, this way of understanding vision is at odds with Milner and Goodale's functional distinction between “vision for action” and “vision for perception.”
The problem, as I see it, is as follows: there is a wealth of empirical evidence regarding the two streams and their roles in visual experience, but there is no theoretical framework for this evidence which is sensitive to basic Husserlian phenomenology. In this paper, I will try to provide such a framework. The most relevant insight from Husserl is that visual phenomenology always includes a spatial and temporal fringe, or horizon. There is always an indeterminate periphery in space, and there is always anticipation of the next instant in time. These features of visual phenomenology are neglected in some of the most well-known literature on the two visual streams. But, as I intend to show, it is precisely these features which best explain the differences between the two streams.
I am going to sketch the evidence that the crucial difference between the two cortical streams is in their spatiotemporal processing, rather than functional output: the dorsal stream processes peripheral retinal input with a high temporal resolution, and the ventral stream specializes in foveal input with less temporal resolution. These suggestions about input differences can be found in the existing empirical literature, but there is yet no way to understand them in relation to conscious visual perception. That's where Husserl comes in. The contribution of this paper is to show how Husserl's more sophisticated phenomenology can actually help us make sense of the disparate bits of empirical evidence. One way to express my main thesis, then, is as follows: dorsal stream processing plays a main role in the spatiotemporal limits of visual perception, in what Husserl identified as the visual horizon.
In addition to providing a new, phenomenologically motivated, interpretation of the empirical evidence, my thesis can clear up some areas of disagreement in the existing literature. For instance, Milner and Goodale are at odds with Yves Rossetti, Laure Pisella, and their colleagues over the way to describe the role of the dorsal stream (Pisella et al. 2006; Rossetti et al. 2010; Milner and Goodale 2010). The former maintain that dorsal processing is “vision for action” in the “here and now.” The latter have emphasized that dorsal processing is devoted to peripheral vision. The account that I develop offers a synthesis of elements from these two approaches.
Importantly for our purposes, the outputs of the dorsal system are unconscious, while those of the ventral system are phenomenally conscious (in humans). (Carruthers 2005: 200)
[Noë's enactive view] would still clash with the facts about the two visual systems, since the enactive view would dictate that the (in fact unconscious) dorsal states are conscious. (Block 2005: 270)
My task here is neither to evaluate higher-order theories of consciousness, nor to defend Noë's enactivism against Block's critique. Instead, I hope to show that there is a good alternative to the received understanding of the two visual systems, to the understanding of dorsal output as unconscious vision for action and ventral output as conscious vision for perception. As a part of my phenomenological interpretation of the empirical evidence, I suggest that dorsal processing can contribute to conscious experience in the form of the visual horizon.
The neurophysiological division of cortical visual processing into two streams is widely accepted in the current literature (Gangopadhyay et al. 2010), but there are important voices of dissent. Careful hodological work in non-human primates suggests that there are three, rather than two, cortical streams (Rizzolatti et al. 1998; Rozzi et al. 2006; Gallese 2007). In particular, the suggestion is that the dorsal stream should be further divided into a dorso-dorsal stream and a ventro-dorsal stream. Gallese suggests that “The dorso-dorsal stream has the characteristics suggested by Milner, Goodale, and Jeannerod when they describe the dorsal stream as a whole” (2007: 3). Thus, the points I make about the spatiotemporal properties of dorsal processing can be taken as applicable to the dorso-dorsal stream. A discussion of the ventro-dorsal stream must be left for further research.
The article will consist of four parts. In the first part I will illustrate the phenomenological description of the visual horizon. In the second part I will outline the neurophysiological and anatomical evidence that input to the dorsal stream differs in spatiotemporal properties from input to the ventral stream. In the third part of the paper, I will outline the relevant evidence from localized cortical damage and visual illusions. In the fourth part of the paper, I will mention how my main claim finds support in models which include anticipation in the neural dynamics of the visual system.
Here I should make two preliminary and related methodological notes. First, I intend this article as a contribution to the ongoing dialogue between the natural sciences and the phenomenological tradition (Zahavi 2010). In this case, I suggest that phenomenological results offer insight into a better way of accounting for the empirical evidence—a way that is sensitive to the a priori structure of visual experience. Since the theoretical framework proposed here also generates an empirical prediction (in the section Computational models of dorsal anticipation), it is also an instance of what Shaun Gallagher has called “front-loaded phenomenology” (Gallagher 2003). I am not committed to the strong reductive claim that dorsal processing constitutes an element of experience, though my thesis is not incompatible with such a reductive approach. To the extent possible, I intend my thesis to remain neutral on the metaphysics of mind, though I am not insensitive to the importance and difficulty of these issues.
The second methodological point is for readers familiar with some recent literature at the intersection of philosophy and neuroscience. As above, I would like to avoid metaphysical commitment to the relationship between neural activity and mental states. Many philosophers are attracted to the idea that areas of cortex are core realizers of mental content (Chalmers 2000; Block 2005). Others suggest that content attribution only makes sense when considering the entire organism in its environment (Hurley 1998; Noë and Thompson 2004; Thompson 2007). I intend my exploration of the dorsal stream's role in visual perception to be compatible with both of these views.
The phenomenology of the visual horizon
Milner and Goodale famously suggested that dorsal processing is for action (and is unconscious) while ventral processing is for (conscious) perception. As noted above, some philosophers have uncritically embraced the dichotomy between vision and action (Block 2005; Carruthers 2005: 72). Other philosophical traditions, though, have long opposed such a dichotomy.2 The close link between perception and self-generated movement was first discovered in the Western philosophical tradition by Aristotle (1957), particularly in De Anima. For over a century now, this link has also been explored in detail within the Husserlian phenomenological tradition. Given the assumption that these philosophical traditions contain valuable insight regarding action and perception, we have a good motivation to resist Milner and Goodale's dichotomy. What I hope to demonstrate here is that there is a way to accommodate the empirical evidence without following Milner and Goodale in placing a wedge between “vision for action” and “vision for perception.” But first, here is some more detail about Husserl's reasons for linking perception and action.
It is well known that Husserl paid close attention to the fact that perception occurs from a single perspective; objects show up for us in adumbrations (Abschattungen). The way in which we perceive perspective-independent properties of objects is by implicitly anticipating how the perspectival appearances will change as we move. Nearly always, our anticipations are fulfilled by new sensations brought about through movement. When anticipations are not fulfilled, we are surprised and must reconcile the surprising appearance with previous appearances. This account of perception was first presented in Husserl's Logical Investigations (1901/1993), and can be found throughout his later work with little change.3
Husserl's general structure of perception brings out three important points. First, visual perception and action are intimately related. We continuously move our eyes and our bodies in visual perception (Hua XVI; Hurley 1998; Noë 2004; Findlay and Gilchrist 2003). When we are moving, we perceive by anticipating the consequences of those movements. This ongoing cycle of movement, anticipation, and fulfillment reveals the close connection between perception and action. Second, the role of action in visual perception highlights the ever-changing indeterminate spatial fringe of our experience (more on this shortly). Third, the notion of anticipation incorporates a temporal fringe of visual perception: according to Husserl, all perception essentially involves the anticipation of the immediate future. To sum up: action and perception are closely linked through the cycle of anticipation and fulfillment, and this cycle always includes a spatial and temporal fringe. Following Husserl, we can refer to this spatial and temporal fringe as the horizon.
Husserl uses “horizon” (Horizont) in a number of different ways in his corpus. Anthony Steinbock identifies three senses of the term: visual horizon, substantive horizon, and transcendental horizon (Steinbock 1995: 105 and 106). It is only the first sense of horizon, the visual horizon, which will concern us here. The visual horizon is spatial because of peripheral indeterminacy, and it is temporal because it involves experiences that are possible in the most immediate future. As Steinbock explains, “the horizon is not only conceived as a spatial halo, but also as a temporal court or fringe projected by the object” (1995: 105, also see Hua III: 58). The visual horizon is a constant feature of visual experience and it includes both a spatial and a temporal fringe.
Take a deck of playing cards and remove a card face down, so that you do not yet know which it is. Hold it out at the left or right periphery of your visual field and turn its face to you, being careful to keep looking straight ahead (pick a target spot and keep looking right at it). You will find that you cannot tell even if it is red or black or a face card. . . . Now start moving the card toward the center of your visual field, again being careful not to shift your gaze. At what point can you identify the color? At what point the suit and number? (Dennett 1991: 53–54).
The region of clearest vision is so small and the clarity shades off so quickly, that, in general, every image actually extending beyond this smallest region will undergo changes in clarity in the case of movement, and so all the appearances, as they progress, will become richer in explication. (Hua XVI: 340, Rojcewicz trans. 1997: 294)
Both Husserl and Dennett are describing the way in which the human visual field has a small point of clarity (corresponding to the fovea on the retina) which is surrounded by a horizon of indeterminacy.
In addition to the spatial fringe, visual perception includes a temporal fringe of possible percepts. Husserl's details on the temporal structure of perception would take us too far afield, but the important point is that the most basic elements of visual perception always occur within a temporal structure. For Husserl, perception and temporality are profoundly interrelated. Every visual sensation “now” is always accompanied by just past (retention) and anticipated future (protention) sensations (Husserl 1999: section 64b, for example). The implicit anticipation of how appearances will or could change lies at the temporal horizon of visual perception. This horizon is the indeterminate border between present and future possible visual experiences. As time unfolds, the indeterminate anticipation of how a novel object will look from a hidden side can become determinately fulfilled as one moves around to view the object from the previously hidden side. Thus, Husserl describes the horizon as a “determinable indeterminacy” (Hua XI section 1).
The main point of this section of the article is that there is an indeterminate spatial and temporal horizon to vision. Dennett's example reveals the spatial indeterminacy and the notion of visual anticipation shows temporal indeterminacy. In what follows, I shall take it as accepted that there is an indeterminate spatial and temporal horizon to vision. Note, though, that not all philosophers would accept this claim. Ned Block, for instance, understands cognitive access to phenomenal consciousness as a binary phenomenon: either we are conscious of something or we are not; there is no in between. Here is not the place to engage with Block's project in depth, but I will mention that his recent target article was challenged by Robert Van Gulick's appeal to Dennett's playing card example (Van Gulick 2007: 528–529). Block responded with evidence that the attentional blink in psychology is a binary phenomenon (Block 2007: 533). This response is unconvincing for at least two reasons. First, Block offered no reason to think that the attentional blink generalizes to all visual experience. Second, he did not directly respond to the playing card example, which is a robust demonstration of phenomenal indeterminacy.
Input to the dorsal stream
Now that the general phenomenological theme of the article has been introduced, here is the neurophysiological and anatomical justification for my main claim. Work over the last few decades on the physiology of the primate visual system has shown that the input to the dorsal stream differs from the input to the ventral stream.4 The most relevant differences for present purposes are that the dominant input to the dorsal stream is processed faster and is less foveally concentrated than input to the ventral stream. Recent results have made it clear that we should not oversimplify the differences between the inputs to the two streams, but all parties seem to agree that there are differences nonetheless. Here is a summary of the relevant evidence regarding these differences. Readers who are not interested in the neurophysiological details may wish to skip to the next section of the article.5
There are two major types of parallel pathways from the retina to the thalamus and then on to cortex in primates.6 The magnocellular pathway projects from the retina to layers 1–2 of the lateral geniculate nucleus (LGN) and then terminates in layer 4Cα of the primary visual cortex. The parvocellular pathway projects from the retina to layers 3–6 of the LGN and then terminates in layer 4Cβ of the primary visual cortex (Kveraga et al. 2007; Nassi and Callaway 2006).
In their classic paper, Livingstone and Hubel (1988) reported four key differences (speed, contrast, color, and acuity) in processing characteristics of the two pathways as discovered mostly through anatomical and physiological studies in non-human primates. The response of the magnocellular pathway is faster and more transient than that of the parvocellular pathway, and the magnocellular pathway is more sensitive to low-contrast stimuli. The parvocellular pathway is sensitive to changes in wavelength, unlike the magnocellular pathway, and is thus responsible for color processing. Also, the parvocellular pathway has smaller receptive field centers on the retina, and thereby has a higher acuity than the magnocellular pathway. To summarize, the magnocellular pathway is faster and more sensitive to contrast, and the parvocellular pathway processes color and with better acuity.
What does all of this have to do with the dorsal and the ventral streams? Livingstone and Hubel suggested that “the temporal visual areas [the ventral stream] may represent the continuation of the parvo system, and the parietal areas [the dorsal stream] the continuation of the magno pathway” (Livingstone and Hubel 1988: 744). Subsequent research has revealed that this proposal is an oversimplification; the magnocellular and parvocellular pathways do not map onto the dorsal and ventral streams in such a straightforward manner.7 Importantly, though, Livingstone and Hubel's suggestion is not completely false, either. Milner and Goodale report that “most of the input to the dorsal stream is magno in origin” (Milner and Goodale 1995: 36). More recently, Nassi and Callaway have filled in some details about the input to the dorsal stream. They focus on input to area MT (middle temporal), which “is primarily a processing station within the dorsal stream” (Milner and Goodale 1995: 50). Nassi and Callaway have found that input to MT “originates almost exclusively in [magnocellular] dominated layer 4Cα” (Nassi and Callaway 2006: 12792). They point out that MT probably receives parvocellular input as well, but this input is likely more indirect, and “may require additional synaptic relays” (ibid.). The main point to be taken from these details is the following: the magnocellular stream does not neatly map onto the dorsal stream, but it does constitute the dominant input to the dorsal stream. This finding is especially striking when one considers that the parvocellular stream “is tenfold more massive” than the magnocellular (Livingstone and Hubel 1988: 748).
Recall that magnocellular processing is fast and contrast sensitive, yet color blind and lower in acuity. It is likely that this is the nature of the information processing that dominates the dorsal stream as well. These results motivate my claim that dorsal processing differs in its temporal properties from ventral processing. There is neurophysiological evidence that dorsal processing differs in spatial properties as well, evidence that retinal input to the magnocellular pathway is distributed across the retina in a different manner than input to the parvocellular pathway. To be more precise, the parvocellular pathway is especially concentrated on foveal input, whereas the magnocellular pathway is not. Using intracellular staining techniques on intact human retinas isolated in vitro, Dacey and Petersen (1992) investigated the dendritic field size of the retinal cells which input to each pathway. Although the density of the cells that input to both pathways increases towards the fovea, they think it is likely that the density of cells which input to the magnocellular pathway “increases more slowly approaching the central retina than does [density of cells which input to the parvocellular pathway]” (1992: 9669). Furthermore, they suggest that the cells which input to the parvocellular pathway outnumber the magnocellular input cells by roughly 30:1 in the fovea. If their conclusions are correct,8 the imbalance between magnocellular and parvocellular input is most extreme in the fovea. I should also mention here that there is evidence for cortical magnification of central vision in the ventral stream. Such magnification may be reduced in the dorsal stream (Milner and Goodale 2010; Colby et al. 1988; Brown et al. 2005).
To sum up these neurophysiological and anatomical details, the input to the beginning of the dorsal stream is not concentrated in the fovea, and this input is delivered to cortex faster than the input to the ventral stream, which receives high acuity input concentrated in the fovea. The differences in the processing based on retinal location continue in the cortex. Now I turn to the evidence from localized brain damage which supports my claim that the differences between the two streams are mainly in the nature of their spatiotemporal processing.
Localized damage and illusions
In this section of the article, I will focus on two important areas of evidence for the two visual streams. First, I will discuss the visual and visuomotor deficiencies brought about by localized damage to the cortex in humans, and then I will discuss evidence from actions directed towards illusory stimuli.
For my discussion of localized brain damage, I will focus on the cases of patients D.F. and S.B., who both suffer from visual form agnosia caused by damage to the ventral stream. Also, I will discuss recent work with patients suffering from optic ataxia, a condition caused by damage to the dorsal stream. These cases of localized damage add more support for my suggestion that the dorsal stream is involved with processing of the spatiotemporal limits of visual perception, or of what Husserl would call the visual horizon.
The case of patient D.F. is well known from the work of Milner and Goodale over the last few decades. While in her early 30s, D.F. suffered bilateral damage to her inferotemporal cortex from carbon monoxide poisoning. This damage to the ventral stream has impaired her perceptual recognition abilities, but she is nonetheless able to perform, as normal, a variety of visually guided motor tasks. Milner and Goodale have appealed to evidence from D.F. to make the case that the ventral stream processes vision for perception and the dorsal stream processes vision for action. Their theory suggests that D.F. can still perform visually guided actions because of her intact dorsal stream.
Crucially, the kinds of actions that D.F. can perform are limited to actions of a particular temporal nature. As Milner and Goodale put it, D.F. is only able to perform actions in the “here and now” (1995: 137). Unlike normal subjects, when there is as little as a 2-second delay between her view of an object and the initiation of her grasp of that object, D.F.'s grip size does not correlate with the width of the object (Goodale and Keillor 1994). Milner and Goodale conclude that the dorsal stream processes vision for action, but only for “real-time” practiced actions. Motivated by the other lines of evidence herein, I would reverse the emphasis in the explanation. I have emphasized that the dorsal stream is marked by its particular dynamics, by processing what comes next visually. These dynamics are at work when we perform “real-time” practiced actions because it is such actions which push the temporal limits of visual perception. In other words, rather than say that the dorsal stream is concerned with actions, but only fast actions, I would say that the dorsal stream operates at a particular time scale, and that this time scale is especially useful when we grasp in a natural, practiced manner.
One might ask the following question at this point: if the dorsal stream plays a role in our perception of the visual horizon, and D.F. has an intact dorsal stream, then why does D.F. not consciously perceive the visual horizon? Perhaps she does consciously perceive the visual horizon, at least as much as anyone consciously perceives it. Note that D.F. does have conscious visual experiences of color and texture (Milner and Goodale 1995: 125–126). As Morgan Wallhagen has suggested, it is possible that D.F. also experiences other visual features, but that she is unable to conceptualize and thereby report on those features (Wallhagen 2007: 556). An excellent candidate for the kinds of perceptual content that would be unconceptualized—by both normal subjects and visual form agnosics—would be the spatial and temporal fringe of the visual horizon. I do not have the concepts to describe the indeterminate content of, for instance, Dennett's playing card in the visual periphery, or the sign on the side of the train passing the platform at full speed. Here is one place where Husserl is especially helpful: with care, we are able to describe some of the structure of visual phenomenology even though we might not pin down all the content in a satisfying manner. The structure of the visual horizon could very well be preserved in D.F. without her reporting any particular perceptual content because the visual horizon in normal subjects does not lead to reports of particular content. Admittedly, the case of D.F. alone is not sufficient to conclude that the dorsal stream makes, or can make, some contribution to phenomenal consciousness. In order to investigate this issue further, consider the case of S.B.
Perhaps the case of visual form agnosic S.B. is not as well known as D.F., but it is no less fascinating.9 Sandra Lê and colleagues have presented the case as follows (Lê et al. 2002). At the age of 3 years, a case of meningoencephalitis left S.B. with cortical damage more extensive than that of D.F. After the illness, S.B. lost both of his ventral streams as well as his left dorsal stream. D.F., in contrast, retained intact dorsal streams and the ventral damage was not total (Milner and Goodale 1995). S.B. represents a case of vision with only one dorsal stream. Another important difference between S.B. and D.F. is the age at which their damage occurred. Because S.B. was so much younger at the time of the damage, he likely had greater cortical plasticity as an advantage in recovery.
S.B. experiences no colors, and shows the expected range of deficits of ventral stream damage, including the inability to recognize objects and faces. The relevant question here is whether S.B. has conscious visual experience. The evidence indicates clearly that he must. Surprisingly, he is able to “drive a motorcycle and . . . easily catch two table tennis balls at the same time and juggle with them . . .” (Lê et al. 2002: 59). Also, he “is bothered by high luminance levels; he prefers to move within a low luminance world (dawn, night)” (Lê et al. 2002: 71). He has no problem moving about in an unfamiliar environment (ibid.). And this is all with only one dorsal stream.
At the very least, the case of S.B. shows that dorsal stream processing can contribute to conscious experience. Now, the early plasticity of S.B.'s cortex after the damage might mean that S.B.'s dorsal stream is connected in ways that are not to be found in normally developed subjects. Thus, it would be wrong to conclude that the dorsal contribution in normal subjects is precisely everything that S.B. experiences. The important point, though, is as follows. If and when the dorsal stream makes a contribution to conscious experience, we should not expect that it would be a contribution that could be easily described. Therefore, the fact that D.F. cannot report on properties that are probably processed by her intact dorsal stream does not entail the conclusion that she has no visual experience of those properties. She could experience them, but in the way that we experience the temporal and spatial horizon, not unlike the way in which S.B. visually experiences the world.
The final set of cases to mention here are cases of optic ataxia due to dorsal stream damage. Milner and Goodale have emphasized that D.F.'s intact dorsal stream can enable her to perform visually guided actions in the “here and now.” The temporal constraints on the nature of actions enabled by dorsal processing motivate my re-description of dorsal processing as having to do with the temporal limits of visual perception. I am also suggesting that the two cortical streams differ in the spatial nature of their processing. This claim finds support in recent articles by Yves Rossetti, Laure Pisella, and their collaborators. Based largely on their work with optic ataxics, they have argued that the dorsal stream processes peripheral information, while the ventral stream focuses on central vision. In particular, optic ataxics tend to perform normally, or nearly normally, on actions directed to objects in central vision (Rossetti et al. 2003; Pisella et al. 2006). This finding is consistent with what one might expect from the non-foveal nature of the magnocellular dominant input to the dorsal stream (section Input to the dorsal stream).
In response to this line of reasoning, Milner and Goodale (2010) have emphasized that there have been optic ataxics who have shown deficits in grasping in central as well as in peripheral vision (see Goodale et al. 1994). In line with my thesis, though, one can explain deficits in natural grasping by appealing to the temporal properties of dorsal processing. Natural grasping is a fast motion which would require the high temporal resolution of dorsal processing.
As Rossetti et al. indicate, patients with optic ataxia do not complain of general deficits in vision for action. Instead, they tend to be aware of their disability when they are unable to perform skillful actions quickly; for instance, they complain of “a slowness and clumsiness in writing” (Rossetti et al. 2003: 177). Also, optic ataxics sometimes report difficulty with the exploration of a new and complex environment, such as a busy train station (ibid.). This difficulty is what one might expect with a deficiency in peripheral vision, because such a deficiency could alter the pattern of saccades that one would normally make in exploring a novel and rapidly changing environment. To restate things in Husserlian terms, the visual horizon is especially valuable when exploring novel and dynamic environments. In such situations, we need to anticipate visually the way things will change and we need to detect unanticipated changes in the periphery. With damage to the dorsal stream, and, I suggest, a subsequently compromised visual horizon, optic ataxics have difficulty coping with such situations.
Another main source of evidence for suggesting a functional dichotomy between the two cortical streams comes from experiments using visual illusions. Perhaps the best known of these experiments involve the Titchener circles. Aglioti et al. (1995) showed that normal subjects fall victim to the illusion perceptually, but that their grip aperture reflects the true (non-illusory) size of the circles. Milner and Goodale take this result as further evidence for a functional dichotomy between vision for action and vision for perception. The question of whether action falls victim to perceptual illusion has been pursued widely in the last couple of decades. This is not the place to review the sizable literature, but I would like to make two brief points that are important to keep in mind.
The first point addresses the central/peripheral distinction. If there is a difference between ventral and dorsal processing which reflects central versus peripheral vision (Pisella et al. 2006), then it will be important to consider how the strength of illusions varies with position in the visual field. For instance, the peripheral drift illusion (made popular with Akiyoshi Kitaoka's “Rotating Snakes”) occurs only in the periphery. In contrast, when viewing the Titchener circles, subjects presumably saccade between the two sets of circles in order to bring each set into central vision. Indeed, the Titchener circles illusion is so subtle that it is not clear whether we can experience it beyond central vision, i.e. by fixating on a point which places both sets of circles in the periphery. Thus, it may not be precise enough to claim simply that an illusion fools conscious vision, since conscious vision can be central or peripheral.
The second point involves the dynamics of the experiments. The illusions occur when subjects are allowed to view the stimulus in an unhurried manner, but the grasp, which purportedly is not victim to the illusion, occurs quickly. Subjects are instructed to reach naturally, which de facto means to reach with some speed. Indeed, there is evidence that slowing the movement brings on the illusion (Rossetti et al. 2005; Króliczak et al. 2006). In addition, subjects who are instructed to reach in an awkward manner fall victim to the illusion. After subjects practice the awkward grip, the illusion no longer affects their reaching (Gonzalez et al. 2006, 2008). Assuming that increased skill means increased speed, these results further support my claim that it is fast vision, not vision for action, which is supported by dorsal processing.
So if—and this remains a matter of debate—there is a dissociation between experiencing the illusory stimulus, on one hand, and visually guided grasping of it, on the other, the dissociation is not between conscious vision for perception and unconscious vision for action. Rather, the dissociation is between judgments based on central vision, on one hand, and fast grasping movements, on the other. The illusions in question do not occur in fast visually guided actions, nor is there evidence that they occur in peripheral vision. Both of these abilities, fast movements and peripheral vision, are supported by dorsal processing, or so the evidence indicates. Thus, one could conclude that the dorsal stream is not fooled, as it were, by the illusion and still maintain that the dorsal stream is devoted to the limits of spatial and temporal vision, to the visual horizon.
In this section, I have tried to outline evidence from localized damage as well as the perception of illusions in normal subjects. I have argued that the key differences between the streams are spatiotemporal, rather than differences between action and perception. In the following section I will discuss some recent neurocomputational models of vision which also give the dorsal stream an important role in visual anticipation.
Computational models of dorsal anticipation
Up until now, I have only outlined the evidence that dorsal processing is faster and less foveally concentrated than ventral processing. But a part of the Husserlian framework is that the visual horizon is characterized by visual anticipation. In this section of the article, I will mention some recent models of visual anticipation that include the dorsal stream.10 These models are still somewhat theoretical, which means that the neurophysiological details are still being worked out.
“Brain connectivity is mostly bidirectional . . . [U]nder a theoretical generative model perspective on brain function, it is the backward connections that generate predictions and the forward connections that convey the traditional feedback, in terms of mismatch or prediction error signals.” (Logothetis 2008: 872)
These models of neural dynamics suggest that anticipation is distributed and widespread throughout the massive feedback connections in cortex and thalamus. What I have emphasized here, though, is that visual processing occurs on two distinct time scales, with the magnocellular stream delivering input to cortex slightly faster than the parvocellular stream. Given this fact, it seems reasonable that the faster processing scale might play a special role in providing anticipatory feedback for the slower, but more accurate, processing in the ventral stream. Indeed, several variations on this idea have been developed.
Combining evidence from visual masking and neurophysiology, Haluk Ogmen and Bruno Breitmeyer have developed a retinocortical dynamics model in which visual perception always involves a succession of three temporal phases: feedforward dominant, feedback dominant, and reset (Ogmen 1993; Ogmen et al. 2006). The time differences in the magno- and parvocellular streams play a key role in determining temporal phases in visual perception. Although the focus of this model is subcortical, the model could be compatible with the suggestion that anticipatory feedback occurs in the cortical dorsal processing as well.
The network employed in top-down facilitation of object perception may be part of an older system that evolved to quickly detect environmental stimuli with emotional significance. This primarily may involve scanning the environment for threat and danger cues, but also could include the detection of other survival-related stimuli, mating or food-related cues. (Kveraga et al. 2007: 160, emphasis added)
This suggestion lends further support to my thesis that it is important to consider spatial and temporal dynamics when comparing the two cortical processing streams.
The final model of anticipatory neural dynamics is not terribly unlike Bar's, and, of the three here, it fits best with my main thesis that the dorsal stream plays a key role in the visual horizon. In a series of articles, Jean Bullier (Nowak and Bullier 1997; Bullier 2001a, b) has made the case that the magnocellular stream, as well as areas in parietal and frontal cortex, including the dorsal stream, constitutes what he calls the “fast brain” system. This system provides feedback to earlier areas V1 and V2 in order to facilitate processing of information delivered by the parvocellular stream. Such a functional role for the dorsal stream in the “fast brain” could naturally be understood in terms of the temporal horizon.
One final point to mention here is that these models generate a novel empirical prediction. If, as suggested, the dorsal stream plays some role in visual anticipation, then dorsal stream damage should bring about a decrease in object recognition speed. Evidence of such a decrease would be further evidence in support of my thesis.
The currently dominant understanding of the neurophysiological distinction between visual processing streams in cortex is based on the distinction between action and perception. What I have tried to show here is that there is another option, an option which goes far in accommodating the empirical evidence. This other option is inspired by philosophical work which maintains a close link between action and visual perception. The alternative, based on Husserl's concept of the visual horizon, is that the difference between the two streams is chiefly a difference in spatiotemporal processing. That is, the dorsal stream deals with fast processing of peripheral information and the ventral stream deals with slower processing of foveal information. Also, as others have suggested, the dorsal stream may be a part of anticipatory visual activity. Such visual anticipation would further complement the Husserlian model.
For some examples of Milner and Goodale's main hypothesis influencing discussions of visual consciousness, see Crick and Koch (1998: 98) and Chalmers (2000: 21). Also note that some of Andy Clark's work on this topic supposes that Milner and Goodale are correct in emphasizing the dichotomy between conscious vision for perception and unconscious vision for action. In his influential article from 2001, for example, both the assumption of Experience Based Control (EBC) and the hypothesis of Experience Based Selection (EBS) are formulated without mention of the temporal and spatial scales at play in conscious experience.
For some recent philosophical work that challenges the dichotomy, see Noë (2004), Schellenberg (2008), and Briscoe (2009). Of course, the Gibsonian ecological tradition in psychology could also motivate criticism of the dichotomy.
My comments here are about the physiological nature of the inputs to the two streams, and I am not going to enter the somewhat large debate over egocentric versus allocentric coding in the two streams. For a treatment of this issue from a philosophical perspective, see Briscoe (2009).
There is also a third, koniocellular pathway that is not as well understood as the other two major pathways. The koniocellular pathway also includes far fewer cells than the other two (Kveraga et al. 2007; Callaway 2005).
For a review of the challenges to their proposal, see Milner and Goodale (1995: 34–36). Some important articles on this topic include Schiller and Logothetis (1990), Merigan and Maunsell (1993), and more recently Nassi and Callaway (2006, 2009).
I am not suggesting here that visual anticipation is exclusively enabled by the dorsal stream. Thanks to Nivedita Gangopadhyay for this point.
I am grateful for the helpful comments I received when presenting this material at the Center for Subjectivity Research in Copenhagen in May 2010. My research for this article was supported by a collaborative research project on Consciousness in a Natural and Cultural Context (CONTACT), which was coordinated by the European Science Foundation (ESF).