A number of perceptual (visual and proprioceptive) illusions seemingly present problems for PEM accounts of perception. For example, in the MLI two lines are perceived as having different lengths despite the fact that they are equal (Fig. 1).
That illusory experience persists even when we know that the two lines are of equal length. In spite of the open communication principle, where top–down and bottom–up exchanges are said to minimize prediction errors, perceptual illusions seemingly allow prediction errors to rule. Even if our priors include reliable and secure knowledge that the lines in the MLI are equal, the system seems unable to correct the sensory errors that form the illusion.
One classic way to explain perceptual illusions is to appeal to cognitive impenetrability. Traditional modularist accounts, for example, hold that sensory processes are modular and informationally encapsulated such that they provide input to the system, but do not receive input from central cognitive processes (Fodor 1983). This is a simple denial of open communication. Sensory modules involve a closed system of information processing; they block or filter out any information coming top–down from processes based on wider belief or knowledge. I may know that the lines in the MLI are indeed equal, but this knowledge does not penetrate the modular structure of the early visual processing areas, and I continue to see the lines as unequal.
In contrast, for predictive processing, open communication should mean that the knowledge gained from our prior experience with the illusion, or from being informed about the true length of the lines, will correct our perception, and we should be able to see that the lines are equal. Instead they continue to appear unequal. We can measure the lines and adjust our hypothesis, which should then correct any prediction errors found in our sensory experience; but this attempted minimization of prediction errors fails. Likewise, with the RHI,[Footnote 4] my model of the world, and thereby my prediction, is that this rubber hand is not part of my body, but I seem unable to eliminate the prediction errors coming from the combined tactile-visual stimulus (Fig. 2).
In the RHI the visual input dominates the proprioceptive sense of where my hand is located. Similarly, in the AHI,[Footnote 5] the sight of what is supposedly my hand in the act of misdrawing a straight line from A to B dominates the kinaesthetic sense of what my hand is actually doing (Fig. 3). I can know all the details of the experiment, and that I am actually seeing the hand of the experimenter in a mirror, and so know full well that the hand that I see is not my hand, and that the movement is not my movement; but the effect (a weird feeling that my hand is doing something other than what I want it to do) persists (Gallagher and Sørensen 2006).
Even in the context of discussing such illusions, the open communication in the cognitive system is often portrayed in a straightforward manner. A recent example can be found in Matamala-Gomez et al. (2020). Here is how it works, according to them.
Conceptually, the [PEM process] assesses the improbability (surprise) of the sensory information under a hierarchical generative model…. In the case of body ownership illusions [for example, the RHI], altered body perception results from a self-representation that is updated dynamically by the brain to minimize sensory conflicts (i.e., the differences between the predictions about sensory data and the real sensory data at any level of the hierarchical model). Then, [with respect to] body ownership illusions our brain tries to minimize the “surprise” through the predictive coding scheme, when encountering a signal that was not predicted, it will generate prediction errors and will update the model in order to minimize the differences between the predictions about sensory data and the real sensory data at any level of the hierarchical model (e.g., synchronous visuo-tactile feedback in the RHI study…). Therefore, the subjects will update their internal body representation. (Matamala-Gomez et al. 2020, 4).
In other words, through this process of open communication, the system adjusts its hypothesis to accommodate the sensory experience: the rubber hand is part of my body.
Like any principle, however, open communication must allow for exceptions, or more precisely, according to PP, there is a flexibility built into the communication process. Different situations call for just this flexibility. There are circumstances in which it may be beneficial to allow priors to dominate and ignore prediction errors. Clark (2016) provides a good example. When driving on a foggy but familiar road one should rely on prior knowledge about how such conditions can affect our perception (Clark 2016). By contrast, when driving on an unfamiliar mountain road with sharp curves, it would be wise to trust what our attuned senses are telling us even if they don’t match our expectations. “[V]ision needs to be flexible in the way that it deals with variations in context” (Ogilvie and Carruthers 2016, 725). This is a matter of adjusting the precision weights of specific hypotheses and prediction errors, since “the precise mix of top–down and bottom–up influence is not static or fixed. Instead, the weight given to sensory prediction error is varied according to how reliable (how noisy, certain, or uncertain) the signal is taken to be” (Clark 2016, 57).[Footnote 6] Flexible precision-weighting allows for an “astonishingly fluid and context-responsive” system (Clark 2018, 523).
Precision-weighting requires the flexibility that Ogilvie and Carruthers mention and that Clark describes. This kind of flexibility, however, if it modulates the exchange of information based on reliability parameters, does not block communication entirely and cannot explain the persistence of illusions such as the MLI. It would be difficult to call it flexible and fluid if the communication flow is stopped. It would rather become a rigid system without the possibility of learning. If we accept the very basic idea that Bayesian inference operates with a variable learning rate, modulated by beliefs about precisions, the learning rate cannot reduce to zero, and there needs to be some degree of communication.
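The point about a variable learning rate can be made concrete with a minimal sketch, assuming a single Gaussian belief updated by a single Gaussian observation. This is a textbook simplification, not a model proposed by any of the authors cited here, and the function name and numerical values are hypothetical. Under that assumption the learning rate is the ratio of sensory precision to total precision, so it shrinks as the prior becomes more precise but remains above zero for any positive sensory precision.

```python
def precision_weighted_update(prior_mean, prior_precision, observation, sensory_precision):
    """One Gaussian belief update (illustrative assumption, not a cited model).

    Returns (posterior_mean, posterior_precision, learning_rate).
    """
    if prior_precision <= 0 or sensory_precision <= 0:
        raise ValueError("precisions must be positive")
    # Learning rate: share of the total precision carried by the sensory signal.
    learning_rate = sensory_precision / (prior_precision + sensory_precision)
    prediction_error = observation - prior_mean
    # The belief moves toward the observation in proportion to the learning rate.
    posterior_mean = prior_mean + learning_rate * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision, learning_rate


# Hypothetical values: a very confident prior meets a weakly trusted signal.
mean, precision, rate = precision_weighted_update(0.0, 100.0, 1.0, 1.0)
print(rate)  # small, but strictly greater than zero
```

Even with a prior one hundred times more precise than the signal, the belief still shifts slightly toward the observation, which is the sense in which communication is attenuated rather than blocked.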
In the case of the RHI, to the extent that there continues to be some form of communication, even if constrained in terms of precision weights, one would think that reliability parameters would favor maintaining the original hypothesis based on one’s long-term body image, i.e., that the rubber hand is not my hand. Indeed, the subject, entering into the experiment, typically starts with that hypothesis, and at any point can access that hypothesis experientially quite simply and directly by closing her eyes. If perception is epistemically flexible, why doesn’t the adjustment, at least in the long term, go in that direction, i.e., in the direction of what I reliably know about my body, or have learned about rubber hands? Why don’t perception and cognition engage in effective exchange in order to minimize error, rather than preserving the sensory error and allowing it to rule? In light of the open communication principle, as Ogilvie and Carruthers (2016) put it, “it might seem mysterious why one’s belief should fail to modify the erroneous perceptual representation” (p. 726).
The explanation that Ogilvie and Carruthers (2016) offer relies on the idea of short-circuiting: relevant priors operating at lower levels block prediction errors from reaching higher levels of the system. The lack of ascending prediction errors can still count as information for higher levels. The absence of an error signal in the communication process signifies that the stimulus is as predicted (although it isn’t); in effect there is no contradictory signal despite there being a contradiction in the system.
This resolution of the mystery, however, is not convincing, not only because the contradiction remains (that is, there is an assured belief in the system—“I know that this rubber hand is not my hand,” or in the case of the MLI, “I know that the lines are equal”—that contradicts the sensory signals), but also because it relies on the idea that the perceptual illusion is not ambiguous, i.e., not characterized by uncertainty.
We suggest that something like this [shortcircuiting] occurs when the visual system is processing depth and size information while one looks at a Müller-Lyer figure. As far as the early levels of processing are concerned, relative depth and size have been accurately calculated from unambiguous cues. Hence systems monitoring noise and error levels are being told that everything is in order: there is no need for further processing (Ogilvie and Carruthers 2016, 727).
One should think, however, that if perceptual illusions are unambiguous, then most other instances of perception should be unambiguous, and accordingly, there would never be call for higher-level PEM, since, as Ogilvie and Carruthers propose, when sensory input is sufficiently unambiguous the high-level priors need not come into play. In the case of perceptual illusions, at least, the high-level priors seemingly go silent. To borrow the message-passing language, they fail to send messages, or at least convincing messages. Or alternatively, the system gets the message but ignores the contradiction.
It is not at all clear, however, that in the case of perceptual illusions, our perceptions remain unambiguous, for two reasons. First, given what we know, that the lines are the same length in the MLI, or that the hand I see is not my hand in the RHI, there is no reason to think that this knowledge shouldn’t come into conflict with the sensory cues. Second, and this may be clearer in the RHI and the AHI, the perceptual ambiguity can be measured in terms of how surprising or unexpected the experience is. In the AHI, especially, the experience is totally unexpected and odd and remains so even when we know what the trick is.[Footnote 7] The PEM short-circuit story denies the ambiguity, explains it away, when in fact PEM’s overarching explanation, where prior knowledge and sensory cues talk to one another (communicate, pass messages, trade in information), should predict the ambiguity that we do experience in these illusions.
Hohwy (2013) recognizes the challenge as he considers the MLI.
Even if we have a strong prior belief that they are of equal lengths, we still perceive one to be longer. So, the conscious experience here seems impenetrable: the higher-level belief fails to modulate perception. What can prediction error minimization say about this? (2013, 124-125).
At first it seems that Hohwy’s answer is different from that of Ogilvie and Carruthers since Hohwy affirms that the illusion provides viewers with an “ambiguous input” (p. 125). However, this ambiguity gets immediately resolved at the perceptual level because “the context provided by the wings [on the Müller-Lyer lines] trigger fairly low-level priors … [leading] to the inference that they are of different lengths rather than to the competing inference that they are of the same length” (2013, 125–126). In other words, the uncertainty or ambiguity is eliminated (short-circuited) early in visual processing in a low-level PEM process, “where the relevant priors occupy levels of the hierarchy within the early visual system itself” (125). Still, one wonders “why the higher-level prior belief in equal lengths cannot penetrate and create veridical experience of the lengths,” or why the higher-level prior belief does not correct the lower-order prior, or vice versa. The answer is that the “consequence of the early resolution to the ambiguity is that very little residual prediction error is shunted upwards in the system, and that therefore there is little work for any higher-level prior beliefs to do, including the true belief that the lines are of equal lengths” (126). If there is ambiguity in the initial sensory input, the system short-circuits it so that it never triggers higher-order priors. The question is never raised through the hierarchy to the level of the more general predictive model. Still, it is not clear why the question is not raised in a more circuitous way since we learn or are informed that the lines are in fact equal. It is not clear why this particular situation is not one in which the system realizes that “you cannot trust the signal from the world, but must arrive at a conclusion, so you rely on prior knowledge” (Hohwy 2013, 123).
Indeed, the system seems to be in contradiction, and it remains a puzzle why a system that is set up to decrease uncertainty by the open communication of information does not do so when it has prior knowledge about the way the world is.
Even if there is early resolution of the ambiguity in the case of the MLI, in the RHI and the AHI ambiguity arguably persists. Hohwy (2015) discusses the RHI in terms of perceptual predictions. In the ambiguous case of the RHI, he suggests, the brain needs to determine which is more probable: that the sight of the rubber hand is independent of the tactile situation of the real hand, or that there is a binding of the visual and the tactile and that you are experiencing the touch where you see it synchronously administered. Hohwy contends that the synchronicity is more expected on the binding hypothesis than on the independence hypothesis, and this leads to the illusion. In this case, the winning hypothesis is apparently on a higher level than the immediate sensory processes, but not high enough to encounter a more certain hypothesis which, according to predictive processing, must also be represented in the brain, namely, that the rubber hand is not really part of my body, or that “the experimenter is the hidden cause of both the seen touch and the unseen touch on the real hand” (Gadsby and Hohwy 2019, 4).
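The comparison between the binding and independence hypotheses can be illustrated with a minimal sketch of Bayes' rule over two hypotheses. The function name and all probabilities below are hypothetical and chosen only for illustration; they are not taken from Hohwy or from any experiment.

```python
def posterior_binding(prior_binding, lik_sync_given_binding, lik_sync_given_independence):
    """Posterior probability of the binding hypothesis after observing synchrony.

    Two-hypothesis Bayes' rule; all inputs are illustrative assumptions.
    """
    joint_binding = prior_binding * lik_sync_given_binding
    joint_independence = (1.0 - prior_binding) * lik_sync_given_independence
    return joint_binding / (joint_binding + joint_independence)


# Hypothetical numbers: binding starts out implausible (0.05), but synchrony
# is far more expected under binding (0.9) than under independence (0.02).
p = posterior_binding(0.05, 0.9, 0.02)
print(round(p, 3))  # about 0.70: binding wins despite its low prior
```

The sketch compares only the two hypotheses Hohwy describes. It leaves out the further hypothesis that the experimenter is the hidden cause of both touches, which is exactly the omission the argument here presses on.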
One could also think of this in terms of intersensory precision and what happens proprioceptively.
The answer lies in the relative precision afforded to proprioceptive signals about the position of my arm and exteroceptive (visual and tactile) information—suggesting a synchronous common cause of my sensations. By experimental design, I am unable to elicit precise information about the position of my arm because I cannot move it and test hypotheses about where it is and how it is positioned. Conversely, by experimental design, the visual and tactile information is highly precise …. In other words, the precision of my arm position signals is much less than the precision of synchronous exteroceptive signals…. (Hohwy 2013, 107)
Vision, apparently, is the most precise, because it wins out over, and one might say hijacks, both proprioception and touch. It’s clear that the proprioceptive signal is weak since one is unable to move one’s tactilely stimulated hand during the experiment. Since proprioception is weak, the predicted location of the touch takes its orders to march in synchrony with what vision says, and we experience the tactile stimulation in the rubber hand. In this case, the experimental setup, especially the synchrony of tactile and visual stimulation, introduces a bias into intersensory precision.
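The dominance of vision over a weak proprioceptive signal can be sketched as a precision-weighted average of two position estimates. This is a standard cue-combination simplification assumed here for illustration, not a model given in the sources; the function name, positions (in centimetres), and precisions are hypothetical.

```python
def fuse_position_estimates(visual_pos, visual_precision, proprio_pos, proprio_precision):
    """Precision-weighted average of two estimates of hand position.

    Assumes independent Gaussian cues; an illustrative simplification.
    """
    total_precision = visual_precision + proprio_precision
    weighted_sum = visual_precision * visual_pos + proprio_precision * proprio_pos
    return weighted_sum / total_precision


# Hypothetical setup: the rubber hand is seen at 0 cm, the real hand is at 15 cm,
# and vision is assumed to be nine times more precise than proprioception.
felt_position = fuse_position_estimates(0.0, 9.0, 15.0, 1.0)
print(felt_position)  # 1.5, i.e. much closer to the rubber hand than to the real hand
```

On these assumptions the proprioceptive signal still contributes, so the estimate is pulled toward the seen hand without proprioception being silenced altogether.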
In the RHI Hohwy suggests that rather than complying with prior conceptual beliefs, there is a “suppression” of prediction errors (the system doesn’t eliminate them, it ignores them): “It is as if the perceptual system would rather explain away precise sensory input with a conceptually implausible model [this rubber hand is part of my body] than leave some precise sensory evidence (e.g., the synchrony) unexplained” (2013, 107). This still does not explain why our more plausible knowledge, that the rubber hand is not part of my body, fails to correct my experience. Importantly, perception involves a balancing act between prior expectations and sensory processes that are taken to have more or less precision. (1) If the system has high expectations for the precision of the sensory input, then that drives revision of hypotheses higher up the hierarchy. (2) If the expectation is that sensory input will be imprecise, it tends to be ignored, but then higher-order predictions will determine the perceptual inference (Hohwy 2013, 145). In either case the most plausible model should have a role to play. In the circumstance of the RHI, the latter (2) is more likely, since I know my real hand is under the blind; I know that the rubber hand is not my hand; and if I know how the experiment works, I should, at some level, expect the sensory input to be imprecise. Still the illusion works. One might also think that the experimentally imposed inability to move, or “cue ambiguity” when congruent visuo-tactile signals conflict with proprioception (Gadsby and Hohwy 2019), even if the proprioceptive signal is diminished, might signal low sensory precision and motivate a search for a higher-order resolution.[Footnote 8] Indeed, this is what intersensory processes are designed to do, according to Hohwy.
Granted, the conditional independence of the sensory systems, meant to increase overall precision (see Hohwy 2013, 143, 152ff, 251–252), is seemingly compromised if vision hijacks proprioception and touch. This means, at that level, there is a decline in precision that the system does not register. If this helps to explain the short circuit, there is still some further explanation required. We’ll come back to this issue.
In the case of the AHI, it is more difficult to suppress or explain away the sensory input. In this case, intersensory processes and the precision that they bring should be in good order since my hand is moving and generating kinaesthetic sensation. The kinaesthetic signal is strong, unlike the situation in the RHI, and there is consistent tactile pressure sense from holding the pencil and tracing the line; despite that, vision still hijacks the experience, but it leaves behind a strong ambiguity. The experience is not only surprising, but feels strange. The intersensory contradictions produce ambiguity rather than precision, and the ambiguity is never eliminated; if anything, there is an intensification of prediction errors that are hard to ignore. Yet my prior knowledge, that what I am seeing is not my hand, fails to correct or eliminate the illusion.