Memory-Prediction Errors and Their Consequences in Schizophrenia
- Cite this article as:
- Kraus, M.S., Keefe, R.S.E. & Krishnan, R.K.R. Neuropsychol Rev (2009) 19: 336. doi:10.1007/s11065-009-9106-1
Cognitive deficits play a central role in the onset of schizophrenia. Cognitive impairment precedes the onset of psychosis in at least a subgroup of patients, and accounts for considerable dysfunction. Yet cognitive deficits as currently measured are not significantly related to hallucinations and delusions. Part of this counterintuitive absence of a relationship may be caused by the lack of an organizing principle of cognitive impairment in schizophrenia research. We review literature suggesting that a system of memory-based prediction is central to human perception, thought and action, and forward the notion that many of the symptoms of schizophrenia are a result of a failure of this system.
Keywords: Memory-prediction, Schizophrenia, Cognition, Hallucinations, Delusions, Cortical circuitry, Psychosis, Cognitive neuroscience
Impairments in memory-prediction underlying symptoms of schizophrenia

Nature of Prediction: Predictions based on invariant memories lead to accurate, timely perception
Deficit in Schizophrenia: Due to unreliable predictions, sensory analysis is divorced from context and relies more heavily on laborious bottom-up processes, or perception may conform to internally-based, inaccurate predictions.
Symptoms: Hallucinations and delusions

Nature of Prediction: Patterns of sensory stimuli that have been consistently maintained are likely to continue
Deficit in Schizophrenia: A decreased contextual window is exhibited by impairments in mismatch negativity (MMN), smooth pursuit and a tendency to jump to conclusions

Nature of Prediction: Expected effects of one’s own thoughts and actions are incorporated into prediction of upcoming sensation
Deficit in Schizophrenia: Effects of one’s own thoughts and actions are not accounted for in prediction of upcoming sensation

Nature of Prediction: Semantic associations gleaned from past experience guide perception and thought
Deficit in Schizophrenia: Semantic associations are weakened, leading to impaired language production and comprehension

Nature of Prediction: Extensive past experience with facial and vocal analysis directs attention to features likely to indicate emotional state
Deficit in Schizophrenia: Breakdown in memory-prediction function means individuals with schizophrenia are less likely to attend to effective indicators of emotional state
Symptoms: Social impairment, negative symptoms
Stable Perceptions of a Continually Changing World
The human nervous system continuously engages with a vast array of ambiguous and constantly changing sensory signals. Using the visual system as an example, the two-dimensional visual image projected on the retina lacks sufficient information to unequivocally indicate the three-dimensional structure of the world, as varying combinations of object size, distance and orientation lead to the same retinal projection (Berkeley 1975). This ambiguous retinal image is in constant flux due to movements of the body, head and eye of the perceiver as well as actual movement of the objects in the scene. Simultaneously, ambiguous auditory, olfactory and somatosensory signals impinge upon the nervous system, requiring analysis. Further complicating interpretation, each encounter with objects and events is unique. The extraordinary difficulty the field of artificial intelligence has experienced in constructing systems capable of interpreting and interacting with real-world input in a timely manner (Dreyfus 1992) attests to the enormity of the task. Yet to a healthy human, interpretation of this welter of signals and interaction with the world it represents is typically effortless.
To organize coherent percepts from the multitude of fragmented and unstable sensory signals that constitute experience, and to allow for coordinated interaction with the world that these signals represent, the human nervous system must fill in the gaps of sensation, generalize across observations, prioritize stimuli for in-depth processing, and make immediate judgments regarding the meaning of sensory input. The efficient completion of these activities is a primary function of the human brain.
Perception has typically been viewed as the result of a bottom-up reconstruction of sensory input. It is not. Perceptual processes do not simply involve the reproduction of stimuli that impinge upon the sensory receptors, but instead involve systems of inferring meaning by matching fragmented sensory input to a construct that serves as a working model of the world (Helmholtz 1866). Richard Gregory referred to perception as “hypothesis testing” of the current sensory input against the brain’s best guess based on previous experience (Gregory 1980). This concept has recently been refined in the memory-prediction model of cortical function. This model posits that universal cortical algorithms serve to form representations and thus memories that emphasize the consistent properties of sensory and cognitive events. These algorithms predict the next moment of experience based on the confluence of these invariant memories and current context (Fig. 1). In this manner, our past experiences influence moment to moment perception, allowing regularities in past experience to fill in the missing gaps of imperfect sensation and to facilitate timely interaction with the world (Edelman 1989; A. W. Snyder 1998).
The Utility of Prediction
Unexpected events require longer periods of time to process and perceive. The common expression “It took a second to register” is aptly demonstrated by the flash-lag effect, in which a stimulus moving at a constant speed along a predictable trajectory is perceived as having progressed beyond a stationary stimulus that suddenly and unpredictably flashes when the two stimuli are actually aligned (Fig. 2) (Nijhawan 1994). This flash-lag effect is reduced when the timing of the flash onset is controlled by the subject or is externally predicted by a tone played 300 ms prior to the flash (Lopez-Moliner and Linares 2006). When an object blocks the path of a moving stimulus, the magnitude and direction of the displacement in perception reflect the expectation that the stimulus will “bounce” off the object (Hubbard and Bharucha 1988). Similar “flash-lag” effects have been reported for other visual parameters including luminance, color and spatial frequency (Sheth et al. 2000) and for auditory features including frequency sweeps and spatial motion (Alais and Burr 2003), suggesting that predictive strategies are employed to hasten processing across a wide variety of perceptual domains.
In addition to allowing for timely perception, automatic, unconscious prediction of the next moment of experience provides a mechanism for the efficient allocation of attention, whereby expected stimuli may be effectively ignored and attention reflexively directed to unexpected events (S. Grossberg 1982; Iaria et al. 2008; Mackintosh 1975). In a musical priming paradigm, subjects are asked to make speeded accuracy judgments on a perceptual feature of the target chord unrelated to the preceding harmonic structure, such as whether or not the chord was in-tune. When harmonically unrelated chords that violate expectation are presented, subjects display slower behavioral responses (Bharucha and Stoeckig 1986), increased P3b potentials (Regnault et al. 2001) and increased neural activation of inferior frontal cortex (Tillmann et al. 2003). Similarly, increased activation of inferior frontal cortex is seen with retrieval of semantic linkage between a cue word and a distantly related target compared to a closely related target (Wagner et al. 2001). In this manner, effective memory facilitates the efficient perception and analysis of relevant stimuli, allowing lower level neuronal systems to process expected perceptual stimuli, but automatically engaging upper level systems for additional processing of unexpected stimuli.
Invariant Memories Facilitate Prediction
In order to guide future perception, thought and action effectively, memory (and perhaps innate constructs (see Howe and Purves 2005)) is devoted primarily to encoding the essential elements of experience: aspects that are repeatedly experienced concurrently or in immediate succession. Although low-level sensory properties (such as retinal size, color and absolute pitch) may differ dramatically between experiences with a stimulus, higher order properties (such as relative size, relative color and relative pitch) remain more stable across exposures. As detailed below, many neurons in higher cortical areas are tuned to these “invariant” properties of sensation, and memory of these invariant properties facilitates recognition (Biederman and Cooper 1991; Fiser and Biederman 2001). Because these invariant properties of experience are emphasized, memory is able to guide expectations through contextual associations that span both time and space (Fenske et al. 2006).
The capacity to form invariant memories is developmentally regulated and follows a different time course for different skills. Newborns and young children have awareness and recall of low-level sensory information beyond that of healthy adults. An inverse correlation has been observed between a child’s age and a measure of eidetic imagery (“photographic memory”) (Giray et al. 1976; Paine 1980; Richardson and Harris 1986). Similarly, newborn infants encode memory of sounds more in absolute pitch than do adults, who rely more heavily on relative pitch (Saffran and Griepentrog 2001). We speculate that repeated attention to low-level sensory information at earlier ages facilitates the accuracy of the assumptions that an individual makes about the nature of perceptions later in life. However, the devotion to the details of sensory input seen in infancy and early childhood comes at the expense of perceptual constancy, which, as detailed above, assists older individuals in efficiently processing a complex world of meaningful forms. Rare adults who maintain absolute pitch perception are impaired in their recognition of transposed melodies (Zatorre 2003). While these individuals display a remarkable ability to identify the frequency of a given pitch, they are greatly impaired in their ability to utilize the consistent intervals between pitches to maintain a consistent perception of a melody across musical keys, a skill typically of much greater value. Similarly, across other sensory domains, it is typically disadvantageous to retain the details of experience in memory. As discussed below, a universal function of cortex is to transform the raw, detailed information contained in initial sensory input into compact signals conveying essential meaning as deduced from previous experience.
Internal Model of the External World Affects Perception Through Prediction
Throughout the course of development and adult life, an internal model of the external world is built, consisting of the sum of a person’s invariant memories. This model is constantly updated to accommodate new sensory experiences, yet also configures interaction with the outside world to fit its pre-existing structure (Hawkins and Blakeslee 2004). As we interpret the external world, higher cortical areas are constantly comparing current circumstances to invariant memory stores to form predictions about the next moment of experience. These predictions set the stage for perception by priming cortical areas likely to be activated by bottom-up sensory signals. In this manner humans do not just perceive order in the world of chaotic signals; they impose it. Thus, perception, thought and action are biased towards re-experiencing the expected (Gregory 1968, 1980, 2001).
Wertheimer noted that minor deviations from well-known forms are often suppressed in healthy humans, leading to perceptions that more closely reflect idealized forms than the actual objects (Wertheimer 1976). An especially impressive example of perception molded to fit prediction in this manner is seen in the rotating mask illusion (Gregory 1997). When a hollow mask is rotated and the front is no longer visible, one has the subjective experience of the forward-directed facial features suddenly popping out of the concave back of the mask. Over a lifetime, humans observe countless faces with convex features, and need to make rapid judgments regarding the meaning of those features to allow smooth social interactions. In the case where the facial features are inverted (hollow mask), the higher order processes of the visual system match sensory input with the pervasive past experience of normal faces, and the convex face is perceived even though one knows it should be concave.
The effect of past experience on current perception is widespread. In the visual domain, perception of brightness (Williams et al. 1998), color (Long and Purves 2003) and angles (Howe and Purves 2005) does not faithfully adhere to the actual stimuli, but instead reflects a weighted average of the probability distribution, inferred by past experience, of naturally occurring stimuli that may have resulted in the retinal image. Similarly, speech perception often corresponds more closely to the expectation set by previous experience than to the stimulus. Speech consists of a largely contiguous stream of auditory signals; the demarcations between the beginning and ending of words are frequently ambiguous in a purely acoustic analysis (Brent and Cartwright 1996; Christiansen et al. 1998; Klatt 1980). However, previous experience with the language guides lexical judgments that are reflected in perception of distinct words (Davis and Johnsrude 2007). Similarly, lexical context exhibits a strong influence on perception of missing (Warren 1970) or ambiguous (Ganong 1980) phonemic segments.
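The idea that a percept reflects a weighted average over the stimuli that could have produced the sensory signal can be illustrated with a small numerical sketch. This is only an illustration under stated assumptions: the candidate values, prior weights and likelihoods below are hypothetical numbers, and the function name `posterior_mean` is our own, not something taken from the studies cited above.

```python
def posterior_mean(candidates, prior, likelihood):
    """Probability-weighted average of candidate sources.

    candidates: possible true stimulus values (hypothetical units)
    prior:      how often each candidate occurred in past experience
    likelihood: how compatible each candidate is with the current input
    """
    weights = [p * l for p, l in zip(prior, likelihood)]
    total = sum(weights)
    if total == 0:
        raise ValueError("no candidate is compatible with the input")
    return sum(c * w for c, w in zip(candidates, weights)) / total


# Hypothetical example: three candidate surface brightnesses.
candidates = [0.2, 0.5, 0.8]
# Assumed past experience: bright surfaces were the most common.
experienced_prior = [0.1, 0.3, 0.6]
flat_prior = [1 / 3, 1 / 3, 1 / 3]
# A fully ambiguous input is equally compatible with every candidate.
ambiguous_input = [0.5, 0.5, 0.5]

percept_with_experience = posterior_mean(candidates, experienced_prior, ambiguous_input)
percept_without_experience = posterior_mean(candidates, flat_prior, ambiguous_input)
```

With an ambiguous input, the estimate is pulled toward whichever candidates past experience made most probable; with a flat prior it falls at the midpoint of the candidates.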
Memory-Prediction Model of Cortical Function (Common Cortical Algorithm to Extract Invariance and Propagate Predictions)
As discussed above, invariant representations and the predictions based on them have a profound and pervasive influence on human perception and thought. A recent extension of theories of cortical brain function postulates that common algorithms exist throughout the cortex that generate predictive signals and serve to form the memories that guide them (Friston 2005; Hawkins and Blakeslee 2004; Rao 1999). According to these concepts, the hierarchical structure (Felleman and Van Essen 1991) and consistent columnar architecture (Edelman and Mountcastle 1978) seen throughout the cortex serve to infer patterns from experience and these invariant memories form the context for predicting the next moment of experience.
Anatomists have long proposed that a canonical columnar circuit exists throughout the neocortex (Lorente de No 1949; Ramon y Cajal 1966), a hypothesis later corroborated by the discovery of columnar organization of functional architecture in somatosensory (Mountcastle 1957) and visual (Hubel and Wiesel 1972) cortex. More recently, refined anatomical techniques have revealed a common columnar circuitry across various cortical areas and across species (Douglas and Martin 2004). A simplified diagram of the excitatory components of this “canonical” cortical circuit is presented in Fig. 3. Briefly, the key components of the cortical circuit are:
Layer 4 receives thalamic input as well as input from layers 2 and 3 from regions of cortex lower in the hierarchy. The main projection of layer 4 neurons is to layers 2 and 3 cells within the column. In addition to interconnections with cells in the same layers in nearby columns, layers 2 and 3 neurons project to layer 4 in cortical areas further up the hierarchy.
Layer 6 neurons project divergently to layer 1 of lower regions of the cortical hierarchy, forming synapses predominantly with dendrites from layer 2, 3 and 5 neurons. Some of the neurons in layers 2, 3 and 5 project to layer 6 neurons in the same column. In turn, these layer 6 neurons project divergently to layer 1 of lower regions.
Neurons in layer 5 represent the output of the column. Pyramidal cells in layer 5 project to subcortical structures involved in action (such as basal ganglia) and also send collaterals to non-specific thalamus. Neurons in non-specific thalamus in turn project broadly back to the same area of cortex, thus completing the thalamocortical loop.
The memory-prediction model posits that the central algorithm of the cortical column serves to form predictions of impending bottom-up activation based on patterns of activity in top-down, horizontal and thalamic loop input to the column. Although the details are largely speculative, these inputs to the column roughly represent: a broad view of the current situation (top-down inputs), the previous moment of experience (thalamic loop connections) and current context (horizontal connections). Synapses from these sources that are reliably activated prior to bottom-up activation are strengthened through Hebbian processes; this strengthening constitutes memory. Hierarchical Bayesian models based on these assumptions have demonstrated the ability to extract the statistical regularities inherent in sensory input and, after training, produce robust visual pattern recognition performance despite distortion and changes to scale and translation (George and Hawkins 2005; Karklin and Lewicki 2005).
Eventually, these synapses are strengthened to the point that top-down and thalamocortical loop signals are able to drive activity in layers 2, 3 and 5, partially activating columns prior to full activation from bottom-up signals. The overlap in particular columns of top-down activation from invariant memories (the “hypothesis” of the larger picture being experienced) and thalamocortical activation conveying specific information about the last moment of experience constitutes a prediction regarding the next moment of experience. Neural network modeling has suggested that such sensitizing feedback is crucial for learning, memory and stabilizing perception (Stephen Grossberg 2000).
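The strengthening process described above can be sketched as a toy Hebbian update. This is a deliberately minimal illustration, not a model from the cited literature: the learning rate, the weight ceiling, the activation threshold and the function names are all assumptions chosen for clarity.

```python
def train_hebbian(trials, rate=0.1, w_max=1.0):
    """Strengthen one context synapse over a series of trials.

    trials: list of (context_active, bottom_up_active) pairs, each 0 or 1.
    The weight grows only when the context input and the subsequent
    bottom-up activation of the column occur together (Hebbian rule).
    """
    w = 0.0
    for pre, post in trials:
        w = min(w_max, w + rate * pre * post)
    return w


def predictive_activation(w, context_active, threshold=0.5):
    """True if the context input alone is enough to partially activate
    the column before any bottom-up signal arrives (assumed threshold)."""
    return w * context_active >= threshold


# Context reliably precedes bottom-up activation on every trial.
reliable = [(1, 1)] * 8
# Context and bottom-up activation never coincide.
unreliable = [(1, 0), (0, 1)] * 4

w_reliable = train_hebbian(reliable)
w_unreliable = train_hebbian(unreliable)
```

In this sketch only the reliably paired input becomes strong enough to drive the column on its own, which corresponds to the partial activation of columns ahead of bottom-up input described in the text.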
A concrete example of the generation of specific predictions based on invariant memory stores is outlined in Fig. 3. In this example, one readily recognizes a novel version of the song “Let it Be” played on a trumpet in a different key from the original song. This recognition occurs despite large differences in the low-level sensory activation generated by the novel versus familiar versions. It is hypothesized that within the cortical hierarchy processing pitch information lies an area of cortex with columns representing specific musical intervals. This area receives a top-down “hypothesis” regarding the identity of the musical phrase that is currently being experienced. In our example, the hypothesis might roughly correspond to “the beginning of Let it Be.” This area of cortex has previously learned the pattern of intervals to expect when this musical phrase is experienced (unison, unison, unison, ascending major 2nd, descending perfect 4th, ascending major 3rd, unison, ascending perfect 4th, ascending major 2nd). Therefore, columns representing these tonal intervals are partially activated by the top-down hypothesis. Information regarding the last musical interval experienced is communicated broadly to this area of cortex by the thalamocortical loop circuit, partially activating columns corresponding to the next expected interval. The overlap at any given time of the top-down hypothesis (the “name” of the musical phrase) with the specifics of prior activation in that area (where we are within the phrase) forms a prediction. In this example, after hearing the first seven intervals that begin the melody of “Let it Be,” all columns corresponding to perfect fourths become partially activated in anticipation of bottom-up stimulation. Bottom-up stimulation indicating that a “C” was just heard arrives in layer four. The overlap of the predicted interval with the bottom-up information regarding the specific identity of the first tone in the interval generates a prediction regarding the next specific tone, which is communicated to lower level cortical levels via layer 6 neurons. Thus, this circuitry is capable of translating invariant memories into specific predictions based on current context.
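The melody example can be made concrete with a short sketch. The interval pattern is the one listed in the text, expressed in semitones; the use of MIDI note numbers, the starting notes and all function names are assumptions introduced only for illustration.

```python
# Interval pattern from the text, in semitones: unison x3, ascending major 2nd,
# descending perfect 4th, ascending major 3rd, unison, ascending perfect 4th,
# ascending major 2nd.
PHRASE_INTERVALS = [0, 0, 0, 2, -5, 4, 0, 5, 2]

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def to_intervals(notes):
    """Invariant representation: differences between successive notes."""
    return [b - a for a, b in zip(notes, notes[1:])]


def melody_from(start, intervals):
    """Build the specific note sequence for a given key (starting note)."""
    notes = [start]
    for step in intervals:
        notes.append(notes[-1] + step)
    return notes


def predict_next_note(template, heard_notes):
    """Combine the stored interval pattern (top-down hypothesis) with the
    last note heard (bottom-up input) to predict the next specific note.
    Returns None if the input does not fit the template or the phrase is over."""
    if not heard_notes:
        return None
    heard = to_intervals(heard_notes)
    if len(heard) >= len(template) or heard != template[:len(heard)]:
        return None
    return heard_notes[-1] + template[len(heard)]


# Two renditions in different keys share the same interval representation.
version_a = melody_from(59, PHRASE_INTERVALS)
version_b = melody_from(64, PHRASE_INTERVALS)

# After the first seven intervals the last note heard is a C (MIDI 60 here),
# and the next expected interval is an ascending perfect fourth.
heard_so_far = melody_from(59, PHRASE_INTERVALS[:7])
predicted = predict_next_note(PHRASE_INTERVALS, heard_so_far)
predicted_name = NOTE_NAMES[predicted % 12]
```

The same stored interval pattern yields a different specific prediction in each key, which is the sense in which an invariant memory is translated into a context-specific prediction.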
The memory-prediction model posits that the nature of the output from a given area of cortex depends on familiarity with the patterns of bottom-up input it is receiving at any given time. When a person is in a new situation experiencing stimuli that do not clearly fit any top-down hypotheses derived from previous experience, a given area of cortex relays the details of its input to higher cortical areas. If this higher cortical area is unable to fit the incoming patterns to a memory-derived template, the details are in turn relayed to yet higher cortical areas. However, when the bottom-up signals are successfully predicted, the cortical area no longer transmits the details of the bottom-up signals it is receiving, but instead neurons from layers 2 and 3 transmit a steady signal, which serves as a “label” for the pattern of activity, to the next higher cortical area in the hierarchy. The next cortical area is then able to begin detecting higher level patterns in the sequence of “labels” received from below. In this way, as a stimulus is repeatedly re-experienced, representations of the details slide down the cortical hierarchy, leaving higher areas free to detect more abstract “patterns of patterns,” and then “patterns of patterns of patterns” and so on. With each level of abstraction obtained through repeated experience, the signal in the highest cortical area becomes more invariant to the particulars of the physical stimulus at any given time, i.e. changes in stimulus details are not reflected in the highest level representations of that stimulus. This build-up of invariance along the cortical hierarchy occurs for two reasons: predictive firing biases full columnar activation from below towards the columns predicted to be active and minor deviations from an expected sequence of patterns are not conveyed to higher cortical areas, but instead the “label” of the sequence most closely matching the observed sequence is transmitted. 
However, large deviations from expectation trigger alternate thalamocortical pathways, sending the details of input to higher cortical areas and automatically directing attention to the non-predicted event.
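The relay rule described in this section, in which a familiar pattern is passed upward as a compact “label” while an unfamiliar one is passed upward in detail, might be sketched as follows. The mismatch count, the tolerance value, the stored patterns and the function name are hypothetical simplifications rather than part of the model as published.

```python
def relay(observed, known_sequences, tolerance=1):
    """Decide what a cortical area sends to the next level up.

    observed:        the bottom-up pattern currently being received
    known_sequences: dict mapping a label to a previously learned pattern
    tolerance:       number of mismatching elements still treated as a
                     "minor deviation" (assumed value)

    Returns ("label", name) when the input is close enough to a stored
    pattern, otherwise ("details", observed) so that higher areas receive
    the unpredicted input in full.
    """
    best_name, best_mismatches = None, None
    for name, stored in known_sequences.items():
        if len(stored) != len(observed):
            continue
        mismatches = sum(a != b for a, b in zip(stored, observed))
        if best_mismatches is None or mismatches < best_mismatches:
            best_name, best_mismatches = name, mismatches
    if best_name is not None and best_mismatches <= tolerance:
        return ("label", best_name)
    return ("details", list(observed))


# Hypothetical stored patterns.
known = {"phrase_A": [0, 0, 2, -5], "phrase_B": [3, 3, -1, 0]}

exact_match = relay([0, 0, 2, -5], known)
minor_deviation = relay([0, 0, 2, -4], known)
large_deviation = relay([7, 7, 7, 7], known)
```

Here the minor deviation is absorbed into the label of the closest stored pattern, whereas the large deviation is relayed in detail, mirroring the two cases distinguished in the text.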
Prediction and Invariance Evident in Physiological Responses Throughout the Brain
Although details of how cortical circuitry supports memory-prediction function are largely speculative, physiological evidence suggests that prediction and the extraction of invariance are fundamental and ubiquitous processes of cortex. Predictive activation has been observed in several brain regions. Areas of the brain that are activated in pain perception associated with mild electrical shock of the forearm are activated when subjects are cued to anticipate the shock (Wager et al. 2004). Event related potential (ERP) findings suggest that healthy humans continuously predict upcoming words in a dialogue, using both context of the immediate discourse and a statistical knowledge of how words fit together (Van Berkum 2008). In the monkey dorsal visual pathway, visual cells in the lateral intra-parietal area fire predictively just prior to a saccade that will bring an object into their receptive field (Duhamel et al. 1992; Ito and Gilbert 1999). Further along the dorsal pathway, in the saccadic frontal eye field (FEF), approximately 20% of movement neurons fire in anticipation of a predictable signal to saccade into their movement field (Bruce and Goldberg 1985). Similarly, in the pursuit region of the FEF, neurons exhibit visual and motor anticipatory activity to predictable stimuli (Fukushima et al. 2002). Not only are motor areas involved in predicting upcoming self-generated movement, they also appear to underlie prediction of movement in others. Motor areas become activated when watching another person perform motor activity (Rizzolatti and Craighero 2004) and under favorable conditions this activity anticipates the other person’s movements (Kilner et al. 2004). This anticipation of another’s actions likely contributes to smooth social interactions (Schütz-Bosbach and Prinz 2007) and is obviously advantageous during physical competition or conflict.
Invariant representations have been most thoroughly documented in the ventral visual stream involved in pattern perception and object recognition. In more anterior regions of the ventral stream in monkeys, neurons that respond selectively to objects and faces demonstrate a high degree of invariance to low-level features of visual input such as object size, retinal position, viewing angle, spatial frequency, luminance (Lueschow et al. 1994; Vogels and Biederman 2002), color and motion (Booth and Rolls 1998; Hasselmo et al. 1989; K. Tanaka et al. 1991; Tovee et al. 1994). However, these neurons display pronounced sensitivity to alterations of facial feature configuration (for example, the relative distances separating the two eyes, the mouth from the nose, etc.), a relative property that should remain consistent across viewings of the same face (Keiji Tanaka 1997; Young and Yamane 1992). These neurons even retain shape selectivity when the object is partially occluded but recognizable (Kovacs et al. 1995). Thus, the activity of various components of visual processing systems appears consistent with the principles of the memory-prediction model and the expectation of invariant representations.
Object representation reaches its highest levels of invariance in mediotemporal cortical structures, i.e. neuronal responses in this region are robust against low-level changes to stimuli. Neurons in the medial temporal lobe (hippocampus, amygdala, entorhinal cortex and parahippocampal gyrus) of humans respond specifically to complex visual stimuli including faces, objects and scenes (Fried et al. 1997). The responses of many of these neurons were found to be remarkably invariant to substantial changes in presentation of familiar stimuli (Quiroga et al. 2005). For instance, a neuron was found that responded exclusively to pictures of the actress Halle Berry. This response was not greatly affected by significant differences in superficial details of the stimulus. The neuron responded to pictures of Halle Berry as herself or dressed up as Catwoman (wearing a mask that covered much of her face); it responded to a drawing of Halle Berry and also to her typed name. The tuning to specific abstract stimulus properties seen in many medial temporal neurons in this study supports arguments that high-level invariant representations are emphasized in memory storage, thereby easing the burden on mediotemporal structures and allowing efficient manipulation of high level symbolic representations in cognitive processes (Barsalou 1999; James and Gauthier 2003; Logan 1988).
Although invariance has been most thoroughly researched in the ventral visual pathway, the construction of invariance appears to be a consistent feature of all sensory systems. Invariance to form (Zwickel et al. 2007) and motion (Rolls and Stringer 2007) is developed along the dorsal visual pathway as neurons in higher areas largely encode stimulus position irrespective of these features. Similarly, the auditory system of adult humans exhibits invariance to pitch identity and instead emphasizes pitch intervals. Unattended pairs of tones, played either simultaneously or in quick succession, elicit a mismatch negativity (MMN) response when the interval between the tones changes, but not when the absolute pitches change in such a way as to preserve the musical interval (Paavilainen et al. 1999), suggesting that some population of neurons in the auditory pathway automatically extracts invariant interval relationships from auditory input.
The behavioral and physiological evidence reviewed above suggests that the extraction of invariant properties from detailed input signals and subsequent prediction of the next moment of experience based on the confluence of these invariant memories with present circumstances are central processes crucial to the normal function of the human brain. Below we argue that disruption of these fundamental cortical processes underlies the phenomenological symptoms and cognitive deficits that characterize schizophrenia.
Perception and Cognition in Schizophrenia
The diagnosis and treatment of schizophrenia in recent history has focused primarily on the most striking and dangerous symptoms of the disorder. Clinicians and families are reasonably concerned when someone with schizophrenia experiences hallucinations, maintains bizarre delusions and engages in erratic, sometimes violent behavior. However, at the core of the disorder is a series of cognitive impairments (Heinrichs and Zakzanis 1998; Saykin et al. 1991) that are likely to exist prior to the onset of the acute psychotic episode (W. J. Brewer et al. 2005; Warrick J. Brewer et al. 2003; Davidson et al. 1999). While patients who are diagnosed with schizophrenia demonstrate a wide variety of cognitive impairments, it remains unclear which aspects of cognitive failure may be most strongly associated with the core of the psychotic illness. The surprising lack of correlation between the severity of cognitive impairment and the severity of delusions and hallucinations (Addington et al. 1991; R. S. Keefe et al. 2006) raises questions about current conceptualizations of cognitive impairment in schizophrenia. Furthermore, while cognitive impairment is viewed as a risk factor for schizophrenia, there is little current understanding about what aspects of cognitive dysfunction precede psychotic symptoms. Post-mortem cortical tissue from individuals with schizophrenia is characterized by widespread alterations (including decreased synaptic density and neuronal displacement) of the canonical cortical circuit (Harrison 1999; Pantelis et al. 2003). Although care must be taken when interpreting post-mortem findings due to possible confounds including chronic illness, medication effects and comorbid conditions such as drug and alcohol abuse, decreased cortical volume and increased ventricular size in drug naïve, first episode patients provide evidence of anatomical abnormalities directly associated with schizophrenia (Janssen et al. 2008; Lim et al. 1996). We propose that many of the symptoms of schizophrenia are explained by this breakdown in the circuitry underlying the formation of invariant representations and the unfolding of predictions from these stored invariant memories.
Prediction of Current Moment Based on Invariant Memories is Impaired in Schizophrenia
One of the more robust physiological findings in schizophrenic patients is a decreased mismatch negativity (MMN) to auditory odd-ball stimuli (Umbricht and Krljes 2005). An MMN is elicited from healthy controls when a regular series of tones is interrupted by a tone that deviates in frequency, duration, intensity or inter-stimulus interval. The elicited MMN is largely unaffected by attention (Naatanen et al. 1993) and the amplitude of the MMN is inversely related to the deviant probability (i.e. the probability of the deviant stimulus occurring) (Pincze et al. 2002). Thus, the MMN appears to reflect a preattentive process of predicting upcoming sensory input based on regularities extracted from previous experience (Friston 2005) and the decreased MMN seen in individuals with schizophrenia suggests a deficit in these processes.
Another well known deficit observed in schizophrenia patients and their first degree relatives is a decrease in smooth pursuit gain and subsequent catch-up saccades to refoveate the target (Holzman et al. 1980). However, when the target travels at a pseudorandom gain (in which the speed of the target varies unpredictably as it travels across the screen), patients and controls perform similarly (Nkam et al. 2007). Strikingly, when a pursuit target changes direction unpredictably, schizophrenic patients maintain tracking better during the brief period around the change than do healthy controls, suggesting that tracking in schizophrenia patients relies more heavily on retinal error signals, while unaffected controls rely more heavily on extraretinal predictive mechanisms to set pursuit gain (Hong et al. 2005). Further supporting this hypothesis, schizophrenia patients exhibit decreased predictive pursuit relative to healthy controls when retinal error signals are eliminated by temporarily masking the pursuit target (Thaker et al. 1999; Thaker et al. 1998) or by stabilizing the target on the fovea (Hong et al. 2008).
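The inverse relationship between deviant probability and MMN amplitude can be illustrated with a simple surprise measure. The sketch below uses surprisal (negative log probability) as a stand-in for the size of a prediction error; this is an assumption made for illustration, not a claim about how the MMN is generated, and the smoothing scheme and function name are likewise hypothetical.

```python
import math
from collections import Counter


def surprisal_per_tone(tones):
    """Surprisal in bits of each tone, given only the tones heard before it.

    Probabilities are estimated from running counts with add-one smoothing
    over the two categories assumed here (standard and deviant).
    """
    counts = Counter()
    result = []
    for heard_so_far, tone in enumerate(tones):
        p = (counts[tone] + 1) / (heard_so_far + 2)
        result.append(-math.log2(p))
        counts[tone] += 1
    return result


# Hypothetical odd-ball sequence: nine standard tones, then one deviant.
sequence = ["standard"] * 9 + ["deviant"]
surprise = surprisal_per_tone(sequence)

last_standard = surprise[-2]   # a well-predicted tone
deviant = surprise[-1]         # the rare, unpredicted tone
```

In this toy measure the rarer the deviant has been, the larger its surprisal, while a system that failed to extract the regularity of the standard tones would show little difference between the two.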
The breakdown of memory-prediction function in schizophrenia is apparent in the difficulty individuals with schizophrenia display in interpreting suboptimal stimuli. In the Perceptual Closure task, fragmented images of common objects are presented in accordance with the ascending method of limits, from the most degraded to the most complete. Individuals with schizophrenia require more complete stimuli than healthy controls before they are able to identify the images (Cavezian et al. 2007; Doniger et al. 2001). This deficit allows enhanced performance compared to control subjects in a task in which subjects are shown degraded stimuli of known objects and later asked to draw the stimuli from memory as accurately as possible (S. Snyder 1961). Whereas healthy control subjects tended to fill in the “missing” portions of the figures, patients more accurately reproduced the figure in their drawings, suggesting a decreased influence of invariant memories from past experiences with the familiar objects and a relative weighting of low-level processing in perception and memory in schizophrenia.
Invariant Concept of a Face is Weakened
The analysis of faces is a particularly important function for human beings. A lifetime of experience analyzing faces predisposes healthy humans to perceive faces even from suboptimal stimuli, a tendency that is diminished in schizophrenic individuals. Schizophrenia patients demonstrate significant impairment in indicating the presence of a briefly flashed, degraded face in the Mooney faces task, but respond indistinguishably from controls in trials in which the face is vertically flipped and scrambled, indicating that the poor performance is not due to generalized visual processing deficits (Uhlhaas et al. 2006). Further, when schizophrenic patients are presented with a depth-inverted face stimulus, similar to the rotating mask described earlier, they are more likely to perceive the veridical “hollow” face than are control subjects (Emrich 1989), and scores on this binocular depth inversion test (BDIT) correlate with BPRS scores (Schneider et al. 2002), indicating a more direct relationship with symptomology than more standard tests of cognition. Although the decreased influence of past experience on current perception offers schizophrenic patients an advantage in the artificial conditions of the BDIT, in more typical scenarios this leads to impaired performance and eventually to the establishment of unusual ideas about fundamental aspects of reality. Schizophrenia patients exhibit difficulty identifying faces as their own, familiar or unfamiliar (Irani et al. 2006) and display deficits in facial recognition and face matching tasks (Archer et al. 1992; Kerr and Neale 1993). Superficial differences between photographs of identical faces, such as lighting conditions and visual angle, more greatly hamper patients’ ability to match a target face to a selection of six sample faces in the Benton task (Benton and Van Allen 1972; Blanchard and Neale 1994; Whittaker et al. 2001), suggesting a relatively greater influence of lower level visual properties compared to invariant predictions in face processing in schizophrenia. These findings suggest that the natural arrangement of facial features that make up a face is not as tightly assembled in the memory of schizophrenic subjects relative to healthy controls; top-down processes exert less guidance of sensory input towards typical perceptions.
It is not surprising, in light of the profound perceptual deficits of faces, that schizophrenia patients exhibit impaired processing of emotional components of facial expressions (Archer et al. 1994; Harrington et al. 1989; Kee et al. 2006; Martin et al. 2005; F. Schneider et al. 2006). Deficits in emotional processing correlate with scores on the Span of Apprehension task, in which subjects are instructed to indicate whether a T or an F is present in a briefly flashed array of distractor letters, suggesting that a deficit in the ability to quickly extract perceptual information may underlie emotional processing impairments (Kee et al. 1998). Unlike healthy controls, when schizophrenia patients assign emotions to a facial stimulus, these judgments are not affected by upside-down presentation, suggesting a deficit in processing the configural relationship of facial components (Chambon et al. 2006). Schizophrenia patients are also impaired in predicting which facial features will be informative of emotional content, as demonstrated by work on visual scan paths. Schizophrenia patients focus attention on individual facial components more than on configural relationships of these components (Loughland et al. 2002). Together, these results suggest that some of the impairment of emotional processing seen in schizophrenia is due to decreased influence of expectation to guide attention to key configural facial features for rapid extraction of emotional information.
Automatic Semantic Associations are Weakened
Due to a weakening of the quick and automatic predictions that are dependent upon the confluence of invariant memories and current circumstances, people with schizophrenia struggle more than usual to find meaning in their environment. John and Hemsley (1992) investigated the effect of semantic meaning on performance of a forced-choice task in which subjects indicated whether an image fragment had been contained in a previously briefly flashed image. Because semantically meaningful material tends to be more robustly encoded than non-meaningful material (for review: Craik 2002), the semantic stimuli tend to be easier to match to subsequent stimuli. Schizophrenia subjects in this study were able to use semantic meaning to improve their percentage of correct answers (although to a lesser extent than controls). However, for patients, the increased accuracy came at the cost of increased reaction time, while healthy controls exhibited decreased reaction time and improved accuracy in the semantic condition. This finding suggests that schizophrenia patients piece together semantic meaning using slower bottom-up processes, leaving little time for rehearsal, thus requiring longer deliberation during recall. Indeed, semantic meaning that effortlessly manifests when a scene is viewed by a healthy control subject appears to require deliberate consideration of the scene’s details for someone with schizophrenia to extract (Reich and Cutting 1982).
Similar deficits of automatic retrieval of semantic associations have been observed in language-based tasks in schizophrenia. Although letter and category fluency tasks were designed to assess general verbal processing speeds, a meta-analysis found a differential deficit in semantic category fluency in schizophrenic patients (Bokat and Goldberg 2003). Subjects typically name more words on the category fluency task than on the letter fluency task, presumably because past experience builds stronger connections in the lexicon between semantically related words than between words that merely share a starting letter. However, this tendency is diminished in schizophrenic individuals, suggesting a weakening of the connections between related items that comprise invariant memories. Similarly, schizophrenic patients display a decreased effect of semantic context in priming subordinate definitions of homonym pairs in tests of both implicit (Bazin et al. 2000; Cohen 1999) and explicit (Bazin et al. 2000) memory compared to healthy controls, perhaps due to a decrease in the influence of expectancy (Barch et al. 1996). Unlike healthy controls, schizophrenic patients do not exhibit a decrease in the N400 peaks to semantically primed words compared to unprimed words (Kiang et al. 2008), which suggests a hypoactivation of concept representations semantically related to the prime. In the patients, the amplitude of the N400 to a word following a related prime correlated with positive psychotic symptoms, consistent with the notion that deficits in memory-prediction underlie psychosis.
Because the semantic associations that comprise a person’s internal model of the world are weakened in schizophrenia, comprehension of language is labored. When reading or retelling a passage that conforms to typical expectations, schizophrenic patients exhibit pauses and verbal hesitations similar to those of healthy control subjects reading passages containing violations of typical presuppositions (Clemmer 1980). Further, in the reverse Cloze paradigm, in which subjects attempt to fill in the gaps in a written passage subjected to deletion of every fifth word, schizophrenia patients are impaired in their ability to use the redundancy inherent in speech to produce appropriate responses (Newby 1998). These results suggest that impairments in the acquisition of verbal information in schizophrenia may be a result of a weakened ability to form expectations regarding what information is about to become available.
The deficit in automatic retrieval of semantically related words seen in schizophrenia leads not only to impaired verbal comprehension, but also to disordered speech production. When schizophrenia patients were asked to discuss a given topic in the original Cloze paradigm, the degree of predictability in their speech (as assessed by the percentage of deleted words a rater was able to correctly identify) was significantly lower than that of healthy controls. The differences between the words schizophrenia patients spoke and the words raters chose to fill in the blanks were not simply synonymous substitutions, but instead had semantic implications (Newby 1998). Both the difficulties in verbal comprehension and the unusual speech production suggest that a top-down model of the world is either diminished or abnormal in schizophrenia.
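The predictability measure used in the Cloze paradigm can be made concrete with a short sketch. The code below is only an illustration under stated assumptions, not the scoring procedure of Newby (1998): it assumes that every fifth word is deleted (the interval mentioned for the reverse paradigm) and that a rater's guess counts only if it matches the deleted word exactly, ignoring case. The function names make_cloze and predictability are hypothetical.

```python
def make_cloze(text, every=5):
    """Blank out every `every`-th word; return (gapped words, deleted words).

    The deletion interval is an assumption for illustration.
    """
    gapped, deleted = [], []
    for i, word in enumerate(text.split(), start=1):
        if i % every == 0:
            gapped.append("____")
            deleted.append(word)
        else:
            gapped.append(word)
    return gapped, deleted


def predictability(deleted, guesses):
    """Percentage of deleted words that the rater restored exactly.

    Exact, case-insensitive matching is an assumed scoring rule.
    Missing guesses count as misses.
    """
    if not deleted:
        return 0.0
    hits = sum(d.lower() == g.lower() for d, g in zip(deleted, guesses))
    return 100.0 * hits / len(deleted)


# Usage: a transcript is gapped, a rater guesses, and the score is computed.
transcript = "the cat sat on the mat and the dog sat on the rug"
gapped, deleted = make_cloze(transcript)
score = predictability(deleted, ["the", "lay"])
```

Under this scoring, speech whose deleted words a rater can rarely restore receives a low score, which corresponds to the lower predictability reported for patients.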
Memory-Prediction Failures Beget Conclusion-Jumping
Schizophrenic patients demonstrate an increased tendency to jump to conclusions in a task in which they must determine whether marbles are being drawn from one of two jars: one containing 85% black, 15% white, and the other vice versa (Moritz and Woodward 2005). Because the progression of previously drawn marbles is visible, the task does not contain a working memory component. Patients generally reach conclusions with fewer marble draws and express more certainty than normal and psychiatric controls (Fear and Healy 1997; Garety et al. 1991). These characteristics tend to be more pronounced in delusional than non-delusional schizophrenic patients (Moritz and Woodward 2005). Additionally, when disconfirmatory evidence is presented, delusional patients over-adjust their assessment of which jar is being drawn from compared to normal and psychiatric controls as well as non-delusional schizophrenic patients (Garety et al. 1991; Moritz and Woodward 2005). Delusional patients were also the only group to fail to adjust their strategy when told the proportion of minority to majority colored marbles would be increased (Moritz and Woodward 2005). These findings suggest that schizophrenic patients, particularly those experiencing significant delusional symptoms, exhibit a bias towards recent events in their decision making process, consistent with a failure of past events to guide expectation.
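For reference, the evidence available in the two-jar task can be quantified with Bayes' rule. The sketch below is our illustration rather than an analysis from the cited studies: it assumes equal prior probability for the two jars and independent draws, and it models a "decision" as the first draw at which either jar's posterior probability reaches a chosen threshold. The function names and the threshold parameter are hypothetical.

```python
def posterior_mostly_black(draws, p_major=0.85, prior=0.5):
    """Return P(jar is the mostly-black jar) after each draw.

    draws: string of 'B' (black) and 'W' (white) marbles, in order.
    p_major: proportion of the majority color in each jar (85% in the task).
    prior: assumed prior probability of the mostly-black jar.
    """
    posteriors = []
    p = prior
    for marble in draws:
        # Likelihood of this marble under each jar.
        like_black_jar = p_major if marble == "B" else 1 - p_major
        like_white_jar = 1 - p_major if marble == "B" else p_major
        # Bayes' rule update.
        p = p * like_black_jar / (p * like_black_jar + (1 - p) * like_white_jar)
        posteriors.append(p)
    return posteriors


def draws_to_decision(draws, threshold):
    """First draw at which either jar's posterior reaches threshold, else None."""
    for n, p in enumerate(posterior_mostly_black(draws), start=1):
        if max(p, 1 - p) >= threshold:
            return n
    return None


# Usage: a lenient threshold commits after one marble, a stricter one waits.
early = draws_to_decision("BBWB", threshold=0.80)
later = draws_to_decision("BBWB", threshold=0.95)
```

In this idealized account, a single marble already supports the matching jar with probability 0.85, and one marble of the opposite color after a single draw returns the estimate to 0.5. Deciding after very few draws therefore corresponds to accepting a low evidence threshold, and a large swing after one disconfirming marble corresponds to weighting the latest draw more heavily than the accumulated sequence.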
Memory-Prediction Deficits Account for Many of the Defined Cognitive Deficits in Schizophrenia
Impairment of memory-prediction function in schizophrenic patients may represent a unifying deficit underlying many of the more thoroughly defined cognitive deficits associated with the disease. Prediction based on previous experience serves to speed perception, thought and action by priming cortical columns expected to be driven by bottom-up input. Additionally, prediction serves to direct attention to sensory features of expected importance, or to stimuli that violate predictions. Because elements of the sensory world are less contextually integrated in schizophrenia, the predictive signal that cascades from higher to lower cortical regions is incomplete. More of the sensory world is unpredictable for someone with schizophrenia, the salience of items is developed more randomly, and sensations compete for limited attentional resources. Impairments of memory-prediction function may thus impair performance on tasks sensitive to processing speed and attention.
A breakdown in memory-prediction function may account for many of the clinical symptoms of schizophrenia (Table 1). Perhaps the memory-predictive process might be best viewed as the construction over time of a probability curve generator which continuously produces probability curves describing the likelihood of particular events occurring in the upcoming moments of experience (Fig. 4). These processes are impaired in schizophrenia, resulting in flattened probability curves that less reliably guide perception towards accurate percepts. While individuals with schizophrenia are often able to accurately identify stimuli through labored bottom-up processes (John and Hemsley 1992), these require more extensive involvement of higher cortical areas for interpretation. This time and labor intensive strategy is not always practical. When real world processing demands are high, individuals may rely more heavily on timely but impaired predictive processes, leading to increases in inaccurate perceptions. When sensory information is especially lacking or difficult to interpret, misguided internal representations may overwhelm reality-based perception in the form of hallucinations.
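The idea of a flattened probability curve can be illustrated numerically. The sketch below is a toy example under our own assumptions and is not a model proposed in the literature reviewed here: a prediction is represented as a probability distribution over three candidate percepts, flattening is implemented with a hypothetical temperature parameter, and prediction and sensory evidence are combined by simple multiplication and normalization. All numbers are invented for illustration.

```python
import math


def flatten(probs, temperature):
    """Flatten (temperature > 1) or sharpen (temperature < 1) a distribution."""
    weights = [p ** (1.0 / temperature) for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(probs):
    """Shannon entropy in bits; higher values mean a less informative curve."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def combine(prediction, sensory_likelihood):
    """Percept probabilities: prediction times sensory likelihood, normalized."""
    joint = [p * l for p, l in zip(prediction, sensory_likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


# Invented example: three candidate percepts and a degraded stimulus.
sharp_prediction = [0.7, 0.2, 0.1]        # well-formed expectation
flat_prediction = flatten(sharp_prediction, temperature=4.0)
degraded_input = [0.40, 0.35, 0.25]       # weak, ambiguous sensory evidence

percept_with_sharp = combine(sharp_prediction, degraded_input)
percept_with_flat = combine(flat_prediction, degraded_input)
```

With the sharp prediction, the degraded input is resolved firmly in favor of the expected percept. With the flattened prediction, the result stays close to the ambiguous sensory evidence, so more stimulus information would be needed to reach the same confidence, in line with the Perceptual Closure findings described earlier.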
Memory-prediction impairments may derive from genetic variation in multiple brain functions, or could also be associated with harsh environmental circumstances that fundamentally alter an individual’s memory-based predictive abilities. For example, the unexpected harsh violence of wartime may alter fundamental predictive perceptual capacities in an untrained individual. Memories with strong emotional content are more robustly recalled (Hamann 2001), and trauma can have a damaging effect on memory-related neural structures including the hippocampus (Uno et al. 1989), thus limiting the capacity of individuals to utilize memory-based perceptual functions. Psychosis has been reported to be more likely in times of trauma and severe emotional stress (Calhoun et al. 2007).
Since the real-world demand to process information marches on with the stream of time, a repeated inability to perceive correctly may lead to an accumulation of inaccurate but internally-meaningful perceptions that could then build upon one another into incorrect beliefs. This failed process may be at the core of the development of hallucinations and delusions. Context-based perceptions of real objects and real events may be reduced in favor of an interpretation of reality that is individually determined and disconnected from experiences and beliefs shared by others. Even when stimuli are successfully identified through largely bottom-up processes, a mismatch between predicted and actual sensory input may lead to emotional arousal and to abnormal salience of events and objects (Anscombe 1987; Kapur 2003), as they will be experienced as disconnected from a relevant context. A desire to make sense of these emotionally salient events may lead an individual with schizophrenia to link them to temporally-related mental events and arbitrary interpretations. This tendency to ascribe meaning where none is present has been found to be one of the only significant risk factors in conversion to psychosis in an at-risk population (Hoffman et al. 2007). The shrunken contextual window and tendency to jump to conclusions seen in schizophrenia may contribute to the robustness of delusions in the face of subsequent disconfirmatory evidence and reasoned arguments (Van Dael et al. 2006).
Repeated experiences of events disconnected from a relevant context may lead to an overall conclusion that reality operates in a manner such that there is external management of basic perceptual processes. This misinterpretation of the mechanisms of thought and perception, and the resultant symptoms have been described by various schizophrenia phenomenologists, most notably Kurt Schneider, whose “first-rank symptoms” include delusional experiences such as thought insertion and passivity experiences and specific hallucinatory experiences (K. Schneider 1959). As an example, if a motor command to move one’s arm does not generate the prediction that one’s arm will move, someone with schizophrenia may believe that an outside force is responsible. A similar lack of agency of mental events leads to interpretation of thought insertion and withdrawal. These depersonalization symptoms have been reported to precede a formal diagnosis of schizophrenia and may be predictive of conversion to psychosis (Klosterkotter et al. 2001; Parnas and Handest 2003).
The breakdown of the functioning of the memory-prediction system would likely lead to several symptoms with an impact on interpersonal relationships. Much of social interaction involves the interplay of internally-based viewpoints and commonly shared experience and belief. If the ability to develop accurate percepts and a typical internal model of the world is disrupted, one of the most fundamental expected changes would be difficulty engaging in social interaction. Further, memory-prediction failures may result in the loosened semantic associations underlying thought disorder, which greatly impairs meaningful conversation. Finally, impairment in the prediction of which facial and vocal features convey emotional state may contribute to deficits in processing of emotional information and to inappropriate affect. The inability to engage with other people in a manner that is mutually rewarding is at the core of the functional disability of people with schizophrenia, leading to various difficulties such as extreme social isolation and very high rates of unemployment. This social isolation and reduced necessity for utilizing cognitive skills may further alienate individuals with schizophrenia, and may be associated with the chronic course of the illness.
A key initial test of our hypothesis that memory-prediction errors underlie many of the clinical features of schizophrenia is the extent to which measures of memory-prediction function correlate with clinical symptoms. Currently the studies addressing this question are limited and the results are mixed. As discussed above, the amplitude of the N400 elicited by a word following a priming word correlates with positive symptoms, BDIT scores correlate with BPRS scores, and a Jumping to Conclusions bias was predominantly seen in patients with high scores on the Peters Delusion Inventory. However, a meta-analysis of twenty-two MMN studies in schizophrenic populations indicated that the majority of studies did not demonstrate a correlation between MMN and symptomology (Umbricht and Krljes 2005). Slaghuis et al. (2005) found that smooth pursuit eye movement deficits correlated only with negative symptoms, while Holahan and O’Driscoll (2005) found a correlation with positive symptoms as well. Hong et al. (2003) have suggested that positive symptoms are associated with deficits in predictive smooth pursuit while initiation abnormalities may be associated with negative symptoms. Studies of face processing and facial emotion recognition have found conflicting relationships to symptomology (Marwick and Hall 2008), although there is some evidence that facial identification and facial memory ability may correlate more closely with negative symptoms (Martin et al. 2005; Sachs et al. 2004).
A particularly complex relationship between memory-prediction function and clinical symptoms was found in an associative learning task in which subjects attempted to figure out which foods would cause an allergic reaction if eaten by “Mr. X” (Corlett et al. 2007). Following training, violations of expectation were introduced and consistently confirmed over subsequent trials. During the test phase, behavioral responses of patients with first episode psychosis did not differ from healthy controls, nor did behavioral performance correlate with measures of clinical symptoms. However, imaging revealed decreased activation of right prefrontal cortex (rPFC) in patients compared to healthy controls when expectations were violated as well as increased rPFC activation to predictable outcomes. The magnitude of the difference in rPFC activation to unexpected and predicted events was significantly correlated with delusional scores (current level of unusual thought content on BPRS). These findings suggest that in some experimental paradigms, deficits in memory-prediction may be overcome by compensatory mechanisms or that, as the authors suggest, deviation from prediction may be registered normally, but the process of making inferences from this error in prediction may be abnormal, contributing to formation of delusions. Clearly, more work needs to be done to more fully elucidate the relationship between memory-prediction errors and clinical symptoms.
Our hypothesis that memory-prediction errors are more proximal to the underlying neurobiology of schizophrenia than the prodromal symptoms they engender suggests that tests of memory-prediction may be especially sensitive predictors of individuals who go on to develop psychosis. We have recently outlined a battery of crucial tests of memory-prediction suitable for use in a high-risk population (R. S. Keefe and Kraus 2009). In addition to the Binocular Depth Inversion and Perceptual Closure tests described earlier in this paper, the battery includes the Learned Irrelevance and Spurious Messages from Noise tests.
In the learned irrelevance test, a series of letters is flashed on the computer screen at the rate of 1 per second and subjects are instructed to press a button as quickly as they can whenever an X appears. The main analysis involves comparing reaction times to the target in blocks in which a novel letter (not seen in previous blocks of the test) predictably precedes the target letter to blocks in which a pre-exposed letter (that in previous blocks was not related to presentation of the X) predictably precedes the target letter. Thus, the learned irrelevance test is similar to the latent inhibition paradigm, but is amenable to repeated measurements in a within-patient design (A. Orosz et al. 2007). Individuals with schizophrenia have exhibited less interference of previous exposure on subsequent implicit learning (Gal et al. 2005; A. T. Orosz, et al. 2008), but to our knowledge, this test has not been used in populations at high risk of developing psychosis.
The Spurious Messages from Noise test does not directly assess memory-prediction function, but instead measures the propensity to fill in the gaps of experience with internally-generated thoughts that then color perception, a process we hypothesize is overactive in individuals with impaired memory-prediction function. In this test, subjects are presented with an auditory stimulus consisting of 12 independent streams of speech and asked to repeat any words they hear. A small study of 43 individuals at high risk of developing psychosis found that in the individuals who did not receive antipsychotic medication, the length of the longest phrase generated when repeating “heard” words was strongly associated with subsequent conversion (HR = 1.78, 95% CI 1.26–2.53, P = .001) (Hoffman et al. 2007). However, the number of converters in the no-drug group was only nine; thus, replication of this study in a larger sample is needed.
To our knowledge, there have been no treatments designed specifically for memory-prediction impairments. Numerous programs in academia and industry that aim to develop pharmacologic compounds targeting memory impairments and other cognitive deficits in patients with schizophrenia are currently underway (www.clinicaltrials.gov, accessed May 19, 2009). In addition, cognitive remediation therapies appear to have a modest beneficial effect on cognitive ability and related functional outcomes (McGurk et al. 2007). While this work may lead to pharmacologic and behavioral interventions that benefit cognition in general and improve global functional outcomes such as occupational and social functioning, these efforts tell us little about the potential impact of treatments that would specifically target memory-prediction functions.
Although untested, it is largely assumed that improvement in very specific cognitive abilities in patients with schizophrenia will provide little benefit unless global cognitive outcomes are also improved (R. S. E. Keefe and Harvey 2008). The most promising phase of specific cognitive interventions may be prior to the onset of psychosis. However, this work has largely been hampered by the limited ability of clinicians and researchers to predict which young people will develop psychosis, with specificity estimates of about 20% of subjects converting to psychosis within 1 year (Yung et al. 2003). Thus, any treatment used to prevent psychosis must be sufficiently benign to justify that a majority of treated subjects would never have developed a psychotic illness if they had not been treated (Corcoran et al. 2005; Lencz et al. 2003). If memory-prediction impairments do indeed underlie psychosis and are present prior to psychosis, it may be worthwhile to develop interventions that specifically target memory-prediction errors. Compounds that improve hippocampal function, neurogeneration, and the efficiency of cortico-cortical communication would be top priorities (Boyer et al. 2007), but these treatments would need to be virtually free of side effects to justify their use in at-risk populations. Behavioral interventions may hold more promise. Cognitive-behavioral treatments that address the process by which memories are developed, the processes in which memory-based perceptions are formed, and the ability to apply contextual cues to perceive objects (including humans) and the relationship among them may be particularly effective at improving memory-prediction functions. Non-specific cognitive-behavioral therapies have demonstrated promise in preventing imminent psychosis (Morrison et al. 2004), and could be adapted to address specific memory-prediction impairments.
Summary and Conclusion
Human perception is greatly influenced by past experience. Throughout life, an internal model emphasizing the invariant features of the external world is built up and continually refined. The memory-prediction model posits that the canonical columnar circuit serves to extract invariance from sensory input and to make predictions based on invariant memories and current circumstances. Indeed, prediction and invariance have been observed in the physiological responses of neuronal populations throughout the cortex. Although we are typically unaware of them, unconscious predictions based on invariant memories profoundly influence our perception and thought. These constant predictions optimize perception by (1) filling in gaps in ambiguous sensory input, (2) speeding perceptual and cognitive processing, and (3) strategically and automatically allocating attention.
Because of a breakdown in memory-prediction processes, individuals with schizophrenia are not able to interpret and interact with the outside world with the ease and efficiency characteristic of healthy individuals. Findings outlined above suggest that individuals with schizophrenia are impaired in their ability to use learned regularities of experience to efficiently direct attention, guide behavior, assist with identification of suboptimal stimuli and utilize semantic context. This breakdown may represent the primary cognitive deficit that leads to the clinical symptoms associated with schizophrenia. The frequent mismatch between prediction and sensation is likely to lead to atypical salience of the elements of experience and unusual ideas about reality. Although individuals may be able to partially compensate for impaired memory-prediction function through bottom-up analysis, these time-consuming approaches are likely to be abandoned in times of high stress, and the greater reliance on impaired predictive processes renders individuals with schizophrenia more vulnerable to misperceptions and hallucinations.
If a breakdown in memory-prediction function constitutes the primary cognitive impairment underlying the development of schizophrenia, cognitive tests that target these processes may be sensitive to changes in cognition early in disease progression. Indeed, a population of individuals meeting prodromal criteria demonstrated significantly less perceptual inversion when presented with ‘hollow’ versions of common objects in the Binocular Depth Inversion Illusion Test (Koethe et al. 2006). This test, along with others that require the use of past regularities of experience to guide behavioral responses, may be an especially good predictor of conversion to psychosis in high-risk individuals.
There was no funding source for this paper.
The authors have no financial conflicts of interest related to this paper.