Imagination is more important than knowledge. For knowledge is limited to all we now know and understand, while imagination embraces the entire world, and all there ever will be to know and understand.

—Albert Einstein

Anyone who’s watched young children play has witnessed the vast power imagination holds in our early years. Far from being mere childish distraction, imagination is crucial to the development of advanced cognition, problem solving, and social empathy (Buchsbaum, Bridgers, Weisberg, & Gopnik, 2012). Increasingly, child-development experts are recognizing the importance of imagination and the role it plays in understanding reality. Furthermore, all types of play, including imaginary play, help guide human brain development and shape human intelligence (Blaisdell, 2015). But fantasy isn’t just for kids. Adults use imagination for diverse activities, ranging from artistic expression (such as writing fiction, painting, sculpting, and music) to philosophy, science, and engineering. Imagination serves as the basis for our ability to reason counterfactually about “what ifs” and “if only I hads.” It motivates us to search for hidden causes, whether it be a doctor trying to diagnose the cause of a patient’s symptoms, a scientist searching for hidden causes to explain observations, or a child trying to understand what prevents a simple wooden block from standing on its side (Povinelli & Dunphy-Lelii, 2001). This ability is so advanced in humans that we even try to grapple with concepts that are impossible to imagine, such as infinity, imaginary numbers, and wave-particle duality in quantum physics. Even the subjective experience of another species may be beyond our scientific ken (Nagel, 1974), yet its study is a thriving pursuit of comparative cognition research.

Despite Nagel’s (1974) pessimistic essay, other philosophers have picked up the challenge. “Suppose you imagine something novel—I hereby invite you to imagine a man climbing up a rope with a plastic garbage-pail over his head. An easy mental task for you. Could a chimpanzee do the same thing in her mind’s eye? I wonder” (Dennett, 1995, p. 372). Dennett raises a fascinating question at the heart of comparative cognition: Do animals have minds that work the way ours do? Why would the study of imagination in animals be of interest? An important goal of comparative psychology and cognition research is to understand the origins, evolution, development, and function of behavioral traits and cognitive processes. By investigating the cognitive processes that nonhumans share with humans, we can better understand our own human uniqueness. Despite its critics (Shettleworth, 2010), an anthropocentric approach of looking for evidence in nonhuman species of cognitive functions found in humans can be profitably used to interrogate the content and processes of the animal mind (Burghardt, 2006; de Waal, 1999; Silk, 2016).

While the investigation of true imagination in animals may be premature, there has been some progress in the study of the cognitive process from which imagination is built: mental imagery. First, we must define mental imagery in a way that allows it to be studied in humans and nonhuman animals alike. Mental imagery is the ability to maintain an active representation of the sensory/perceptual details of an event or object in the absence of actual sensory input from the physical event or object. That is, it refers “to representations and the accompanying experience of sensory information without a direct external stimulus” (Pearson, Naselaris, Holmes, & Kosslyn, 2015, p. 590). These representations presumably use many of the same neural processes involved in sensation and perception. As Shepard (1984) cogently put it, “(a) imagining, like perceiving, is surely performed by physical processes in the brain but (b) we do not need to know any details of these processes in order to study imagining. . . . What we imagine, as much as what we perceive, are external objects; although in imagining, these objects may be absent or even nonexistent” (Shepard, 1984, p. 420). And later on, “Properly speaking, our experience is of the external thing represented by those brain processes, not of the brain processes themselves. At the same time, by acknowledging that perceiving and imagining—as well as remembering, planning, thinking, dreaming, and hallucinating—do correspond to brain processes, we at least open the door to possible connections with evolutionary biology, clinical neurology, and artificial intelligence” (Shepard, 1984, p. 421).

We must also distinguish between mental imagery and other processes by which animals (and humans) can represent absent but previously presented events. Such processes include expectancies, representations of space, time, and causality (often referred to as cognitive maps: Blaisdell, 2009), propositional knowledge, beliefs, and schemas. These distinctions will be clarified throughout the article. In this review, I propose that some animals can show evidence of having and using mental images.

There are two ways that a mental image could become active. Perception of an object or event could persist after it no longer impinges on the sensory apparatus. For example, when you look at the scene in front of you and then close your eyes, you still have an impression of the scene held in your mind’s eye, so to speak. Likewise, a salient sound, say a spoken word or piece of music, can persist as an auditory impression after the actual sound has ceased. These images tend to be fleeting, lasting only a few seconds, though the phenomenon of the earworm shows that they may remain for much longer durations, on the order of days, and can be tenaciously persistent (Halpern & Bartlett, 2011; Williamson, Liikkanen, Jakubowski, & Stewart, 2014). Sensory registers and iconic memory have been proposed as processes by which recent sensations remain perceptually present for a short duration after their termination (Coltheart, 1980). These processes presumably underlie the cognitive abilities responsible for object permanence (Blaisdell et al., 2009).Footnote 1

A second route to an active image in the mind is through associative retrieval. When you hear the voice of a familiar person, it may retrieve a visual image of the person. Likewise, language contains words that are visual (written, sign language) and auditory (spoken) tokens for real objects. Perceiving the object-word can retrieve in the mind the perceptual attributes of the object as an image, even when instructed not to. For example, try not to think of a white bear (Wegner & Schneider, 2003). Verbal prompts, like the example by Dennett (1995) above also use associative processes to retrieve the referents (e.g., “man,” “climb,” “rope,” “pail,” “head”) so that they can be subsequently combined according to the propositional instruction to “imagine” an event containing the referents articulated in a specific manner (e.g., “pail on head”) and undergoing a specific set of actions (e.g., “climbing a rope”). This involves not the retrieval of a previously experienced episode (i.e., episodic memory), but the construction of a novel imagined event.

Psychologists and neuroscientists have developed methods to study imagery and imagination in humans (Pearson & Kosslyn, 2013). Pearson et al. (2015) summarize the current state of knowledge about visual mental imagery in humans as follows:

Recent work has demonstrated how imagery can “stand in” for an afferent visual representation of an external stimulus. Specifically, mental images seem to behave much like weak versions of externally triggered perceptual representations. Functional brain imaging work supports the behavioral evidence by demonstrating that common sets of neural structures are employed during both events. Further, both representations seem to be encoded using a common set of basic visual features, which in many visual areas are organized topographically. (Pearson et al., 2015, p. 599).

These same methods have also been applied to the study of imagery in animals. The remainder of this article reviews the literature on mental imagery in animals, focusing specifically on visual imagery, which has received the most attention (though learning of song by songbirds uses a template encoding and matching process that could be likened to an auditory image; Bolhuis & Moorman, 2015).

The concept of mental imagery is not without its critics (Pylyshyn, 1973, 2003; Tonneau, 2013). Pylyshyn (1973) proposes a propositional account in which imagery, to the extent that we report experiencing it, is epiphenomenal and not causally related to behavior and cognition. Instead, according to Pylyshyn, images and other thoughts “in the mind” are manifestations of a propositional representation system. Nevertheless, many of Pylyshyn’s critiques apply specifically to the “pictures in the head” view of mental imagery, and to the fact that mental imagery research in humans often employs instructions that guide the participant to use the imagination process, and thus may be biasing the use of psychological processes that wouldn’t otherwise be engaged. Neither of these criticisms applies to research with nonhuman animals—the latter for obvious reasons, and the former because scientists who investigate mental imagery in animals are not arguing for a pictures-in-the-head account, but rather, as we shall see below, that imagery involves the contribution of sensory and perceptual features to ongoing behavior. The fact that nonhuman animals cannot be biased by the verbal instructions of the researcher makes them particularly useful and important for understanding the mechanisms that govern representational processes, such as mental imagery.

Working memory tasks and mental imagery

The first attempts to demonstrate mental imagery in animals were based on the seminal studies of mental rotation in humans by Shepard and Metzler (1971), using a working memory task. Shepard and Metzler measured the time it took human participants to solve simultaneous discriminations of images of 3-D objects versus their mirror images. In a matching-to-sample (MTS) procedure, participants were shown a sample object and two comparison objects, one that was the same as the sample and the other its mirror image. Comparison objects could be rotated relative to the sample. Reaction times (RTs) were longer the greater the angular disparity (degree of rotation) of the comparisons relative to the sample. This suggested that human participants were mentally rotating the objects so that the sample and comparisons were oriented the same way. Moreover, just like physical rotation, mental rotation was an analog process, where the greater the distance of rotation, the longer it takes to complete the mental rotation.
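The analog account predicts a roughly linear increase in RT with angular disparity, whereas rotational invariance predicts a flat function. A minimal Python sketch of the two predictions follows; the intercept and rotation rate are hypothetical values chosen only for illustration and are not estimates from Shepard and Metzler (1971).

```python
def predicted_rt(angle_deg, base_rt_s=1.0, rate_deg_per_s=60.0):
    """Predicted reaction time (s) under an analog mental-rotation account.

    base_rt_s and rate_deg_per_s are hypothetical illustration values.
    """
    if rate_deg_per_s <= 0:
        raise ValueError("rotation rate must be positive")
    # Rotation time grows linearly with the angular disparity.
    return base_rt_s + abs(angle_deg) / rate_deg_per_s


def predicted_rt_invariant(angle_deg, base_rt_s=1.0):
    """Prediction under rotational invariance: RT does not depend on angle."""
    return base_rt_s


if __name__ == "__main__":
    for angle in (0, 45, 90, 180):
        print(angle, predicted_rt(angle), predicted_rt_invariant(angle))
```

Comparing observed RTs against these two patterns is the core of the mental rotation test: a positive slope favors an analog process, and a flat function favors invariance.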

Hollard and Delius (1982) tested pigeons using an analogous procedure. Subjects were first presented with a sample item consisting of a complex 2-D shape (see Fig. 1a–b). After the sample was pecked 15 times, a pair of comparison items were presented to the left and right of the sample. One comparison was identical to the sample, and the other was a mirror image of the sample. Some pigeons were reinforced for choosing the identical item, and others were reinforced for choosing the mirror image. After subjects acquired this procedure to a high level of accuracy, they were tested with comparisons presented at various degrees of rotation relative to the sample (see Fig. 1c). Humans were also trained and tested on the same stimulus set, and showed the classic effect of slower RTs as a function of the degree of angular disparity between the sample and comparisons. Surprisingly, the pigeons did not show RT differences, though they did show poorer accuracy at rotations of maximal disparity from the sample (see Fig. 1d–e). Hollard and Delius concluded that pigeons processed the stimuli more efficiently than humans did, possibly due to differences in ecological demands between humans and pigeons.

Fig. 1

Panels a and b depict the test apparatus and stimuli, respectively, used by Hollard and Delius (1982) to train and test pigeons on mental rotation. Panel c shows six examples of trials where the correct answer was left-right counterbalanced, with two examples each at 0°, 45°, and 180° rotations between sample and comparisons. Panels d and e show reaction times (s) and errors (%), respectively, for both humans and pigeons (with individual pigeon data identified by subject number) tested by Hollard and Delius (1982)

These results have not gone unchallenged. A similar study using a successive discrimination (aka go/no-go) procedure observed greater response latencies and lower discrimination ratios when training stimuli were presented at increasingly greater degrees of rotation (Hamm, Matheson, & Honig, 1997). Thus, pigeons did not display rotational invariance as reported by Hollard and Delius (1982). The authors suggest that mental rotation is one possible interpretation of this effect, though they considered other possibilities as well. Importantly, the results call into question the species difference between pigeons and people as suggested by Hollard and Delius (1982).

Neiworth and Rilling (1987) also conducted a follow-up to the study by Hollard and Delius (1982), but with a twist. They presented pigeons with an image of a clock-hand stimulus that rotated from an initial location of 0°, with the minute hand at the 12 o’clock position at the start of each trial (see Fig. 2). After the start of the trial, the clock hand began to rotate clockwise at a constant velocity within the face of the clock. On perceptual trials, the clock hand was always visible until it reached a target location at the end of the trial. On imagery trials, however, the clock hand disappeared from the screen at the 90° position (i.e., at 3 o’clock). The clock hand then reappeared at a target location after a specific delay as if it had continued to rotate with constant velocity during the delay. The clock hand also disappeared at the 90° position on violation trials, but then reappeared after a delay at a position inconsistent with constant velocity during the delay. Below the display were left and right response keys. Pigeons were reinforced for pecking one of the keys (e.g., left) at the end of perceptual and imagery trials, and for pecking the other key (e.g., right) at the end of violation trials. Thus, pigeons could receive reward on every trial if they pecked the correct key depending on the type of display. After sufficient training, pigeons showed successful transfer of the discrimination to displays involving a novel intermediate location (158°) and to a novel location outside the boundaries trained (202°). This interpolated and extrapolated transfer, respectively, suggests pigeons did maintain a representation of a constant velocity clock hand during the delay period on imagery and violation trials. 
Of course, the representation does not necessarily consist of an active image in “the mind” of the pigeon, and instead could result from a nonimagery-based computational process that calculates the expected location of the invisible clock hand as a function of time. We return to this issue of how to distinguish true imagery from similar nonimaginal processes toward the end of this article.
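To make the nonimagery alternative concrete, the expected location of the hidden clock hand can be computed from elapsed time alone. The following Python sketch assumes the 90° disappearance point described above; the velocity, delay, and tolerance values are hypothetical and are not taken from Neiworth and Rilling (1987). It illustrates the computation only and says nothing about whether pigeons solve the task this way.

```python
def expected_angle(delay_s, velocity_deg_per_s, disappear_deg=90.0):
    """Where a constant-velocity clock hand should be after an invisible delay."""
    return (disappear_deg + velocity_deg_per_s * delay_s) % 360.0


def classify_trial(reappear_deg, delay_s, velocity_deg_per_s, tolerance_deg=5.0):
    """Label a reappearance as consistent ('imagery') or inconsistent ('violation').

    tolerance_deg is a hypothetical criterion, not a value from the study.
    """
    expected = expected_angle(delay_s, velocity_deg_per_s)
    # Smallest angular difference on the circle, in the range [0, 180].
    diff = abs((reappear_deg - expected + 180.0) % 360.0 - 180.0)
    return "imagery" if diff <= tolerance_deg else "violation"


if __name__ == "__main__":
    # Hypothetical velocity of 90 deg/s and a 0.5 s delay.
    print(classify_trial(135.0, 0.5, 90.0))
    print(classify_trial(180.0, 0.5, 90.0))
```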

Fig. 2

Stimuli used by Neiworth and Rilling (1987) on perceptual, imagery, and violation trial types for 135° training and 180° training and for the test of transfer at 158°. (The solid lines drawn in the circles represent the start location [0°], the point of disappearance [90°] and the point of reappearance, or the location at which the clock hand stopped moving [135°, 158°, or 180°]. The solid arcs indicate visible movement of the clock hand. The dashed arcs indicate where the clock hand would have moved, assuming constant velocity. The clock hand is not visibly present while rotating within the space represented by the dashed arcs)

Associative processes of mental imagery

Humans and other animals have been shown to learn about the world using associative and causal learning. Associative learning forms the basis of prediction and skill learning, whereas causal learning forms the basis of causal knowledge and inference (Blaisdell, 2009). Both types of learning depend on the ability to detect and encode relations between events in the world. But oftentimes, especially for causal relations, not all of the relevant information is perceptually available. How is an agent to make optimal and rational decisions in the face of missing information? One solution is to infer hidden causes. For example, when a doctor observes nasal congestion, red and watery eyes, swollen lymph nodes, and a cough, he or she can diagnose a viral infection as the probable cause. When an infant observes a bean bag being tossed from behind a screen, he or she acts surprised if the screen is subsequently removed to find nobody there (Saxe, Tzelnic, & Carey, 2007). These examples demonstrate the capacity in humans, even very young children, to draw inferences about hidden causes from patterns of observed statistical associations among events (Blaisdell, 2017; Gopnik et al., 2004; Hagmayer & Waldmann, 2004; Kushnir, Gopnik, & Lucas, 2010; Waldmann, Hagmayer, & Blaisdell, 2006).

The existence of such capacities in humans raises the question, are nonhuman animals also sensitive to the ambiguity induced by missing information about the environment? If so, how do nonhuman animals make decisions in the face of missing information?

In organizing this section, I have been guided by the following thesis.

  1. We take as our starting point the notion that the conditioning process involves forming associations between representations of events, such as sensations and actions.

  2. These associations allow for the presentation of one event to retrieve representations of associated events.

  3. If the retrieved associated event involves sensory components, these can be experienced as memories, and even as reperceptions (images).

  4. Thus, an “image” can be associatively evoked (cf. Kosslyn, 2005).

An intriguing early example of the role of associative processes in the control of imagery comes from a study involving hypnosis of human subjects (Leuba, 1940). The experiment took place as part of a classroom exercise. The instructor placed various students under deep hypnosis, and then paired two stimuli, one auditory or visual stimulus to serve as the conditioned stimulus (CS) and the other stimulus involving a sensation (such as visual or tactile) to serve as the unconditioned stimulus (US). After the student was taken out of hypnosis, the instructor then began to make various sounds, until the sound that served as the CS was presented. At this point, the student would report experiencing the sensations invoked by the US as if they were actually experiencing it. An example from the study demonstrates the phenomenon:

E pricked S about eight times with an algesiometer [a device that measures skin sensitivity to pain] on the fleshy part of the right hand between the base of the thumb and index finger while tapping the top of a small can with a pencil. On awakening, S was requested to report whenever he saw, felt or otherwise experienced anything in connection with a series of stimuli. E stamped on the floor, rattled a brief case and so on finally tapping the can with a pencil. S at once scratched the previously stimulated area on his right hand with the left one and said that “it smarts and itches.” He stopped scratching as soon as E stopped tapping the can and started to rub again as soon as the tapping was resumed. (Leuba, 1940, p. 348).

This example demonstrates the power of associative processes to elicit sensory experiences, at least in humans. By the 1960s, a considerable body of evidence had accumulated for the role of conditioning processes to evoke imagery and sensory experiences, including evoking hallucinations in human subjects (Konorski, 1967; Mowrer, 1960; Perky, 1910).

While the role of associative processes in eliciting imagery in humans is now well established and appears to play a role in hallucinations (Powers, Mathys, & Corlett, 2017) and in clinical settings (Dadds, Bovbjerg, Redd, & Cutmore, 1997), the issue in nonhuman animals is still relatively unexplored. Konorski (1967) discussed the role of associative learning in eliciting images (“gnostic” units) through associative retrieval processes. Early studies of short-term memory in pigeons also suggest the retrieval of an image-like representation in tasks that involve prospective coding. These studies used the delayed symbolic matching-to-sample (DSMTS) procedure, in which one type of stimulus (e.g., colored key lights) can serve as a sample and another type of stimulus (e.g., black lines on a white background presented at various degrees of rotation) can serve as the comparisons. Cumming and Berryman (1965) suggested that, upon observing the sample stimulus, the pigeon could learn to translate the sample into a form isomorphic to the correct comparison stimulus. This image-like prospective code would be maintained in working memory after the sample terminated and before the onset of the comparison stimuli, thereby allowing the subject to choose the correct comparison after the delay.

Evidence supporting this hypothesis was found in a clever experiment by Herb Roitblat (1980, Experiment 3). Three color stimuli and three line-orientation stimuli served as samples and comparisons, respectively, for two birds, or as comparisons and samples, respectively, for a third bird. Two of the colors, orange and red, were quite similar and distinct from the third color, which was blue. Two of the line orientations, vertical (0°) and slant (12.5°), were quite similar and distinct from the third line orientation, horizontal (90°). On each trial, one sample was presented, followed by a variable-length delay during which the sample was absent, and finally all three comparisons were presented. Birds were reinforced for choosing the correct comparison for each particular sample. The longer the delay, the more confusion errors the pigeon was expected to make between similar stimuli. Analysis of the confusion errors provided evidence for whether the birds were retrospectively coding the sample or prospectively coding the correct comparison during the delay. The logic of the analysis was that if the birds were remembering the sample, then confusions between similar samples (e.g., red and orange) would increase as a function of delay length. Alternatively, if the birds were remembering the correct comparison to peck, then confusions between similar comparisons (e.g., vertical and slant orientation lines) would increase as a function of delay length. Roitblat found that confusion errors between comparisons increased as a function of delay length, supporting the interpretation that birds were prospectively coding the correct comparison stimulus. Roitblat concluded that “pigeons appear to store the information about the sample, not in terms of an image of the presented sample, but more similarly to an image of the correct choice stimulus” (Roitblat, 1980, p. 349).
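The logic of the confusion analysis can be sketched in Python. The sample-to-comparison mapping below is a hypothetical assignment chosen so that similar samples map onto dissimilar comparisons; the actual assignments used by Roitblat (1980) are not given here.

```python
# Hypothetical sample-to-comparison mapping, for illustration only.
CORRECT = {"red": "vertical", "orange": "horizontal", "blue": "slant"}

# Similarity structure as described in the text.
SIMILAR_SAMPLES = {frozenset({"red", "orange"})}
SIMILAR_COMPARISONS = {frozenset({"vertical", "slant"})}


def classify_choice(sample, choice):
    """Classify a choice as correct or as one of three error types."""
    correct = CORRECT[sample]
    if choice == correct:
        return "correct"
    # Error toward a comparison that resembles the correct comparison:
    # expected if the bird is holding the correct comparison in memory.
    if frozenset({correct, choice}) in SIMILAR_COMPARISONS:
        return "prospective"
    # Error toward the comparison that belongs to a similar sample:
    # expected if the bird is holding the sample in memory.
    implied_sample = next(s for s, c in CORRECT.items() if c == choice)
    if frozenset({sample, implied_sample}) in SIMILAR_SAMPLES:
        return "retrospective"
    return "other"


if __name__ == "__main__":
    print(classify_choice("red", "slant"))
    print(classify_choice("red", "horizontal"))
```

Tallying these error types separately at each delay length would show which kind of confusion grows as the delay increases.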

Another early demonstration that images could be retrieved through associations comes from the work of Peter Holland. Holland poses an interesting question: “If a tone, previously paired with a food, actually induces perception of that food’s flavor, then would a rat made ill while engaging in such surrogate tasting develop an aversion to the food itself?” (Holland, 1990, p. 116). A rat that tastes an actual flavor followed by malaise will typically form a conditioned taste aversion (CTA) to that flavor (Garcia, Kimeldorf, & Koelling, 1955). But will an associatively retrieved image of a flavor undergo CTA as well? A series of experiments performed in the early 1980s seems to suggest that this is the case (Holland, 1981). Rats heard two tones (Tone 1 and Tone 2), each paired with a different flavored food (Food 1 and Food 2). Rats then heard Tone 2 followed by ingesting a toxin inducing stomach malaise. At test, rats consumed 46% more of Food 1 than Food 2. The reduced consumption of the flavor that had been paired with Tone 2 suggests that the presentation of Tone 2 in Phase 2 retrieved a memory of Food 2 including its flavor. The presence of the activated experience of Food 2’s flavor while the rat experienced malaise allowed the memory of the flavor to become associated directly with illness. Holland (1990) likened this associatively retrieved memory to an image, and subsequently showed that an associatively retrieved image plays a role in mediated extinction (Holland & Forbes, 1982) and cue interaction effects such as overshadowing and potentiation (Holland, 1983). More direct evidence that associatively retrieved memories have image-like properties is the finding that associatively learned representations of tastes can activate the same neural ensembles in gustatory cortex that are activated by the presentation of the actual taste (Saddoris, Holland, & Gallagher, 2009).

Reasoning about missing information

I now return to the role of an image in decisions made in the face of missing information. We rarely have direct access to all of the information about causal relationships that govern any particular system. To reiterate an example used above, a doctor can merely observe a patient present with red and watery eyes, a runny nose, swollen and red tonsils, and a low-grade fever to infer a hidden viral cause of these symptoms. Likewise, it was the odd, unpredicted movements of Uranus that led Alexis Bouvard in the early 19th century and later Urbain Le Verrier in 1845—both using the physical-causal system of Newtonian mechanics—to postulate the existence of the as-yet-undiscovered planet Neptune. Humans, even young children (Kushnir & Gopnik, 2005), readily reason about hidden causes when their presence is expected (Hagmayer & Waldmann, 2004; Kushnir et al., 2010).

In the process of studying causal learning and inference in the rat (Blaisdell et al., 2006),Footnote 2 we discovered that, like humans, rats also reason about hidden causes. Moreover, hidden causes were inferred based on prior associative or causal knowledge. For example, when a tone is followed by a light on some trials, and the light is followed by food on other trials during training, this should establish a tone→light→food causal (or associative) chain (Blaisdell et al., 2006, Experiment 2). Upon hearing the tone at test, the rats should expect to see the light turn on next, followed by the delivery of food. But the light was not presented at test, and the rats appeared to recognize the light’s absence for they did not look for food in the food hopper. This suggests that the rats expected the light, based on the presentation of the tone. The absence of the expected light caused them to also not expect the light’s other effects, such as food. To remedy this, we covered the light with an opaque shield at test. By doing so, the rats now looked for food in the food hopper when they heard the tone, despite no actual light being presented (Blaisdell et al., 2006, Experiment 2). Thus, rats seem to recognize that when the light was covered by an opaque shield, they should not be able to determine its presence or absence. Merely the presence of the tone at test was evidence enough, based on the prior tone–light contingency, for them to infer the presence of the obscured light.
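The pattern of results can be summarized as a simple decision rule, sketched below in Python. The function name and the three status labels are hypothetical, and the rule is a description of the behavioral pattern rather than a model proposed by Blaisdell et al. (2006).

```python
def expects_food(tone_present, light_status):
    """Whether food is expected, given the tone and what is known about the light.

    light_status is one of:
      'on'       - the light is visibly lit
      'off'      - the light is visible and unlit (explicit absence)
      'occluded' - the light is covered, so its state cannot be observed
    """
    if light_status not in {"on", "off", "occluded"}:
        raise ValueError(f"unknown light status: {light_status!r}")
    if light_status == "on":
        return True
    if light_status == "off":
        # The expected light is explicitly absent, so its effect is not expected.
        return False
    # Occluded: fall back on the tone-light contingency to infer the light.
    return tone_present


if __name__ == "__main__":
    print(expects_food(True, "off"))
    print(expects_food(True, "occluded"))
```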

This result was at the time surprising, for it reflects a greater sophistication of rationality in the rat than we had previously expected them to have. I now review a program of research conducted in my laboratory to further investigate how rats reason about unobservable events, that is, missing information.

Blaisdell et al. (2009) conducted a direct test of the hypothesis that rats distinguish the explicit absence of an event from its ambiguous absence. Rats received sensory preconditioning treatment in which Tone X was followed by Light A in Phase 1, and Light A was followed by an appetitive US (sucrose) in Phase 2. Rats also received unpaired presentations of Noise Y in Phase 1.

At test, rats spent significantly more time looking for sucrose during presentations of X than of Y on nonreinforced probe test trials, but only if A’s light bulb was removed from the chamber at the time of testing. With the light bulb present and unlit, rats showed no difference in amount of nose poking between X and Y (see Fig. 3, left panel). These results suggest that rats distinguish the explicit absence of A from its ambiguous absence. When the presence of A is ambiguous, it could be the case that rats imagine it is present when they have reason to believe it should be present, such as when its associate, X, is presented at test.

Fig. 3

Left panel: Mean discrimination ratios for nose-poke responses during test trials with the second-order (paired) CS and the unpaired CS from Blaisdell et al. (2009), Experiment 2. Testing was conducted either with the light present or absent. Right panel: Mean discrimination ratios for nose-poke responses during test trials with the second-order CS with the light present or absent during testing. Testing occurred either in the same or different context from where training took place. Error bars represent standard errors of the mean

As an alternative to the hypothesis that rats can imagine expected events that are physically or perceptually absent, it could be argued that removal of the light bulb at test resulted in a context change. Because X was paired with A, but not with food, in Phase 1, it could have acquired some inhibitory properties. If removing the light bulb at test resulted in a context change, this could prevent X’s inhibitory properties from generalizing to this new test context, while its excitatory properties should generalize across contexts, thereby resulting in renewal of responding (Bouton, 1993; but see Bouton, 1994, for evidence against greater context sensitivity of inhibition than excitation to ambiguous stimuli). Thus, in an additional experiment, we replicated the procedure of Blaisdell et al. (2009), but then tested X with A’s bulb removed in either the same (training) context or a different context (Blaisdell & Waldmann, 2012). If the reason Blaisdell et al. (2009) found higher rates of nose poking in the light-absent test condition was due to a context shift created by removal of the light bulb, then by explicitly rendering the test context dramatically different from the training context, we should observe high rates of nose poking in both groups tested in the different context, regardless of the presence or absence of the light bulb. Contrary to the predictions of the renewal account, the right panel of Fig. 3 shows that testing X in a different context actually resulted in relatively little nose poking compared with pre-X baseline rates of nose poking. Only when X was tested in the same context and with A’s bulb removed did we observe rates of nose poking significantly above baseline rates. Thus, we can rule out a renewal account of the effects of removing the light bulb at test.

In another series of experiments, we were interested in whether rats also drew inferences about unobservable outcomes (Waldmann, Schmid, Wong, & Blaisdell, 2012). The paradigm used was simple. For example, in Waldmann et al.’s Experiment 3, training consisted of Pavlovian conditioning in which a light was paired with food. Rats then received an extinction phase during which the light was repeatedly presented alone in the absence of food. Rats were allocated to one of three treatment groups for the extinction phase. For Group Cover, the food niche was covered with a metal panel secured to the wall with two strong magnets, thereby preventing access to the food hopper (see Fig. 4a). For Group No Cover, a shield covered the food hopper but with a hole cut out of the center, thereby allowing access to the food hopper (Fig. 4b). The third group was a Generalization Decrement Control group in which no shield was present during extinction. This group allowed us to assess the degree of generalization decrement, if any, in the other two groups when extinction was carried out in the presence of the metal shield and subsequent testing took place in the absence of the shield. After extinction, rats received a final test session during which the cover was removed for all rats, and the light was presented on nonreinforced probe trials. Rats extinguished with the cover in place showed greater entries into the food niche during nonreinforced probe tests of the light (with the cover now removed) than did rats extinguished without the cover or with the cover that had a hole providing access to the food niche, suggesting that the rats understood that the cover blocked access to information about food delivery (Fig. 4, right panel). Because the light had always signaled food during Pavlovian training, rats behaved as if they continued to imagine that food was delivered during extinction, though they couldn’t verify this because of the presence of the cover. 
The rational approach to updating the contingency between two events requires evidence that the contingency has changed. If evidence is lacking, contingency information is not updated. Thus, in addition to being able to discriminate the explicit versus ambiguous absence of a cue, rats are also able to discriminate the explicit versus ambiguous absence of the outcome—in this case, a food reward. Like humans (Hagmayer & Waldmann, 2007; Kushnir et al., 2010), rats appear to recognize the conditions under which they should be able to observe an event and those conditions under which the event is unobservable.

Fig. 4

Left panels: Pictures of the apparatus configurations in the Cover (top) and No Cover (bottom) conditions used in Waldmann et al. (2012). In the Cover condition, a metal plate blocked access to the drinking receptacle. Right panel: Mean difference (CS–pre-CS) scores (discrimination index) as a function of Phase 3 Test session for the Cover, No Cover, and Control conditions in Experiment 3 of Waldmann et al. Error bars represent standard errors of the mean

Does the cover always allow for an evoked image to influence responding, or are there conditions in which the image is not activated? A study by Fast and Blaisdell (2011) provides some insight into this question. Rats were trained on either a positive patterning (A−, B−, AB+) or negative patterning (A+, B+, AB−) instrumental discrimination with visual cues (A and B) signaling the presence (+) or absence (−) of a food reward (see Fig. 5, left). A positive patterning discrimination involves reinforcing lever presses only to the AB compound, but not on trials with only one or the other element (A, B) present. A negative patterning discrimination, on the other hand, involves reinforcing lever presses during trials with presentations of either element (A or B), but not on compound AB trials. Once discriminative control of lever pressing by the elements or compound was achieved, rats were tested with only A illuminated while B remained unlit (see Fig. 5, lower right) or occluded from view (covered) by an opaque metal shield (Fig. 5, upper right). Only rats trained on the negative patterning discrimination responded differently when B was covered (ambiguous) compared with when B was explicitly absent, lever-pressing less when B was covered compared with uncovered (see Fig. 6, left). This is consistent with the rat imagining that B was on when A was presented and B was covered. Because AB trials were never reinforced, responding was lower than on tests of A with B uncovered and explicitly off. Rats trained on the positive patterning discrimination, however, did not respond differently with B covered versus uncovered. Response rates were equally low, suggesting that rats did not imagine B to be on during A alone test trials with B covered.

Fig. 5

Left and two center panels depict the three types of trial used in the positive patterning (top panels) and negative patterning (bottom panels) discrimination training procedures used by Fast and Blaisdell (2011). Right panels: Pictures of the apparatus at test with light B covered (top) or uncovered (bottom)

Fig. 6

Left panel: Mean elevation scores from test trials with Cue A in Experiment 1 of Fast and Blaisdell (2011). Subjects in group negative patterning lever pressed more during Cue A with B uncovered than when B was covered. Subjects in group positive patterning showed equally low rates of lever pressing during Cue A with B covered or uncovered. Error bars represent the standard errors of the means. Right panel: Mean elevation scores from test trials with Cue A in Experiment 2 of Fast and Blaisdell (2011). Subjects in group negative patterning lever pressed more during Cue A with B uncovered than when B was covered. Subjects in group positive patterning showed more lever pressing during Cue A with B covered than when B was uncovered. Error bars represent the standard errors of the means

If there was something about negative patterning discrimination training, but not positive patterning discrimination training, that allowed for use of an image in the ambiguous test conditions, then rats that received both positive and negative patterning discrimination training might come to use the image of B on ambiguous tests for either type of discrimination. Fast and Blaisdell (2011) tested this in a second experiment in which rats were trained on both positive and negative patterning discriminations with visual and auditory cues (modality counterbalanced) before experiencing test trials with A illuminated while B was either covered or uncovered. Unlike the first experiment, rats that received both positive and negative patterning discrimination training now showed different levels of instrumental lever pressing when B was covered versus uncovered. Specifically, rats tested on the positive patterning discrimination responded more when B was covered than when it was uncovered (see Fig. 6, right). This behavior is consistent with the rats maintaining an image of B when B’s light was covered because these rats had always been rewarded for lever pressing when both lights occurred simultaneously during training (AB+). These results suggest that something about the negative patterning discrimination was necessary for rats to retrieve images of associated visual cues. We have more recently shown that the effect of concurrent training on positive and negative patterning discriminations on use of an image in the positive patterning discrimination was not due to simple alternative aspects of the task, such as concurrent training per se, or to the amount of reinforcement during training or the amount of training (Fast, Flesher, Nocera, Fanselow, & Blaisdell, 2016).

Neural basis of reasoning about missing information

The finding that the ability of an image of an ambiguously absent visual cue to guide instrumental responding in rats depends on learning a nonlinear discrimination, such as a negative patterning discrimination (Fast & Blaisdell, 2011; Fast, Flesher, et al., 2016), provides a clue to the neural basis of reasoning about ambiguously absent events. Negative patterning discriminations (and configural learning tasks in general) have been shown to depend on a functioning hippocampus, while positive patterning discriminations (which can be solved by linear computational processes) do not (Alvarado & Rudy, 1995; Rudy & Sutherland, 1989; Sakimoto, Hattori, Takeda, Okada, & Sakata, 2013; Sakimoto & Sakata, 2013; but see Davidson, McKernan, & Jarrard, 1993). Furthermore, the hippocampus is recruited when human participants make inferences (Barron, Dolan, & Behrens, 2013; Kumaran, 2012; Reber, Young, Garlick, Pham, & Blaisdell, 2012; Zeithamova, Dominick, & Preston, 2012; Zeithamova, Schlichting, & Preston, 2012). Thus, the hippocampus appears likely to be a critical structure to support reasoning about ambiguous stimulus events.
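The contrast between linear and nonlinear discriminations can be made concrete with a small sketch. It is an illustration only, not an analysis from any of the cited studies: it treats a "linear" solution as a single threshold unit that responds whenever the weighted sum of the cues present exceeds a threshold, and it assumes that the no-cue intertrial interval counts as a nonrewarded case. The function name, the grid of candidate weights, and the trial coding are all hypothetical. A grid search is a demonstration rather than a proof, although the negative patterning case with the no-cue condition included has the same structure as the exclusive-or problem, which no single threshold unit can solve.

```python
from itertools import product

# Trial types coded as (A present, B present) -> rewarded?
# Including the no-cue intertrial interval (0, 0) as a nonrewarded case
# is an assumption of this sketch.
POSITIVE_PATTERNING = {(0, 0): False, (1, 0): False, (0, 1): False, (1, 1): True}
NEGATIVE_PATTERNING = {(0, 0): False, (1, 0): True, (0, 1): True, (1, 1): False}


def find_linear_solution(task, grid):
    """Return (w_a, w_b, threshold) that solves the task, or None if no grid point does."""
    for w_a, w_b, threshold in product(grid, repeat=3):
        # A threshold unit "responds" when the summed cue weights exceed the threshold.
        if all((w_a * a + w_b * b > threshold) == rewarded
               for (a, b), rewarded in task.items()):
            return w_a, w_b, threshold
    return None


# Candidate weights and thresholds from -3.0 to 3.0 in steps of 0.5 (hypothetical grid).
GRID = [step / 2 for step in range(-6, 7)]

positive_solution = find_linear_solution(POSITIVE_PATTERNING, GRID)
negative_solution = find_linear_solution(NEGATIVE_PATTERNING, GRID)

print("positive patterning:", positive_solution)  # some weight setting is found
print("negative patterning:", negative_solution)  # no weight setting is found
```

Under these assumptions, the search finds a weight setting for positive patterning but none for negative patterning, which is the sense in which the latter requires a configural (nonlinear) solution.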

In collaboration with Michael Fanselow, we investigated the role of the hippocampus in processing of an ambiguously covered event (Fast, Flesher, et al., 2016). We replicated the concurrent patterning procedure of Experiment 2 of Fast and Blaisdell (2011). After rats had learned concurrent positive and negative patterning discriminations, we gave the rats microinfusions of the muscarinic antagonist scopolamine into the dorsal hippocampus. Scopolamine temporarily inactivates muscarinic acetylcholine receptors and has been shown to disrupt negative patterning, but not positive patterning, discriminations (Richmond, Nichols, Deacon, & Rawlins, 1997). Temporary inactivation of dorsal hippocampal ACh receptors had no effect on the positive patterning discrimination, which rats were still able to perform, but attenuated performance on the negative patterning discrimination (see Fig. 7, right). By comparison, microinfusions of the buffer solution without scopolamine had no effect on rats’ ability to perform positive or negative patterning discriminations (see Fig. 7, left). Moreover, inactivation of dorsal hippocampus also abolished the difference in responding to Cue A on tests where Cue B was covered (ambiguously absent) versus uncovered and off (explicitly absent). Thus, the dorsal hippocampus appears to be necessary for processing an image of an ambiguously absent (covered) visual cue.

Fig. 7

Mean lever presses occurring during 30-s A-alone tests minus baseline responses in the experiment reported by Fast, Flesher, et al. (2016). White bars reflect tests with B uncovered and gray bars illustrate tests with B covered. Rats responded more to A when B was covered compared with when B was uncovered or A-alone training trials following microinfusions of phosphate buffered saline (PBS), but not scopolamine (Scop). Error bars represent the standard error of the mean and * denotes significant differences between conditions

Additional evidence highlighting the role of dorsal hippocampus in processing an image of an ambiguously absent visual cue comes from imaging of immediate early gene (IEG) c-Fos (Fast, Flesher, et al., 2016). We replicated the instrumental patterning discrimination procedure of Experiment 1 of Fast and Blaisdell (2011), whereby some rats were trained on only the positive patterning discrimination and others were trained on only the negative patterning discrimination. Following training, rats received a single test session during which Light A was presented once before they were removed from the operant chamber for tissue processing. Half the rats in each training condition were tested on Light A with Light B uncovered, while the remaining rats in each training condition were tested on Light A with Light B covered. Expression of c-Fos in the hippocampus would tell us how much neural activity there was as a result of the interaction between prior training regimen and test condition prior to tissue staining. Figure 8 shows representative examples of IEG expression in the dentate gyrus within the dorsal hippocampus (left panels), and the quantified levels of IEG expression across all animals in each test condition (right panel). IEG expression, and therefore neural activity in dentate gyrus, was significantly higher only in those animals that had received negative patterning discrimination training and that were tested on Light A with Light B covered. This was also the condition in which the cover had an effect on instrumental lever pressing (Fast & Blaisdell, 2011; Fast, Flesher, et al., 2016). These results suggest that both the ambiguity of B’s status created by the cover and the activation of an image of B through presentation of its associate, A, were necessary to induce IEG expression in the hippocampus (see Fig. 8).

Fig. 8

Left panel: Representative samples of c-Fos expression in dentate gyrus of the hippocampus for each test group in Fast, Flesher, et al. (2016), positive-patterning-uncovered (top left), positive-patterning-covered (top right), negative-patterning-uncovered (bottom left), and negative-patterning-covered (bottom right). c-Fos positive cells appear brown (DAB); all others appear blue or purple (Hematoxylin QS). Right panel: c-Fos expression in the dentate gyrus of the dorsal hippocampus following the test trial. Error bars represent the standard error of the mean. (Color figure online)

The ventral hippocampus has recently been implicated as a critical structure that allows the subject to discriminate real events from their images (McDannald & Schoenbaum, 2009). McDannald et al. (2011) developed a neurodevelopmental animal model of the positive symptoms of schizophrenia. They review evidence that early during training of a Pavlovian CS as a predictor of a food US, the CS elicits a highly realistic sensory representation of the US. This US representation is to some degree indistinguishable from the actual US. With further training, however, the realistic, sensory representation of the US is replaced by a more abstract US representation, such as a US expectancy, that is distinguishable from the actual US (McDannald & Schoenbaum, 2009). Thus, interfering with the neural circuitry mediating this transition should prevent the transition from taking place and render the animal susceptible to hallucinations, that is, the inability to differentiate an associatively retrieved image of the US from the presentation of the actual US. This impairment in reality monitoring would thus model the positive symptoms of schizophrenia. McDannald et al. (2011) developed such a rat model by lesioning the ventral hippocampus in neonatal rats (NVHL). Once NVHL rats mature to adulthood, they should demonstrate failures in reality monitoring. To demonstrate this, NVHL (and unlesioned control) rats were tested in adulthood on two taste-aversion procedures. As expected, both control rats and NVHL rats formed aversions to palatable foods directly paired with nausea. Only NVHL rats, however, also formed food aversions when the CS that had previously been paired with food was itself (in the absence of food) paired with nausea. The failure of NVHL rats to discriminate actual food from an associatively retrieved image of the food parallels the failure of people with schizophrenia to differentiate internal thoughts and beliefs from reality.

The paucity of data on the neural basis of imagery in nonhuman animals contrasts sharply with the plethora of data from human research, especially from cognitive neuroscience. Nevertheless, the hippocampus is a common structure reported to be recruited in human studies of imagery. The hippocampus, along with prefrontal cortical areas, tends to be involved in top-down aspects of mental imagery, such as when a participant is asked to imagine a scenario (Schacter, Addis, & Szpunar, 2017). These same circuits are recruited in episodic memory tasks, suggesting a common function in remembering past episodes and simulating imaginary ones (Moulton & Kosslyn, 2009). The hippocampal-cortical circuit has been argued to play an important role “to facilitate predictions about the future. At the center of this perspective is the idea that the capture of associations that define event sequences is adaptive because these sequences can be reassembled into novel combinations that anticipate and simulate future events” (Buckner, 2010, p. 28).

Other areas that are expected to be recruited in animal tasks targeting mental imagery include association areas and sensory/perceptual areas of the imagined events themselves. For example, imagining a visual image should recruit dorsal and ventral pathways of the visual stream in the parietal and temporal cortices, respectively, and with suitably designed tasks, even primary visual areas of the occipital cortex (Kosslyn, 2005). Images of tastes and sounds should recruit gustatory and auditory areas, respectively. Identification of such imagery-recruited neural activity in animals would allow for more sophisticated tools, such as optogenetics and chemogenetics, to interrogate the causal role these circuits play in mental imagery.

Theoretical accounts of reasoning about missing information

The research described above was motivated by the hypothesis that an associatively retrieved event representation can act as an image to influence learning (Holland, 1990; Waldmann et al., 2012) and decision-making (Fast & Blaisdell, 2011). As with humans (Kosslyn, 2005), the image can include activation of sensory features normally activated by the event itself, and can enter into new associations as well as drive decision-making behavior similarly to the event itself. This hypothesis has not gone unchallenged. Let us consider a simple case in Experiment 1 of Fast, Biedermann, and Blaisdell (2016). Rats received Pavlovian conditioning in which a light–tone compound CS was paired with delivery of a food US (see Fig. 9). After a few sessions of such training, rats showed high rates of nose poking into the food niche during CS presentations. Rats then received a test session during which the tone was presented alone. When the light bulb where the light had been presented during training was visible and off (explicitly absent), generalization decrement in responding to the tone alone was observed (see Fig. 9). If the light bulb was covered with a metal shield, however, presentations of the tone alone did not produce generalization decrement, and instead very strong conditioned nose poking was observed. According to Fast et al., rats should have learned a light–tone within-compound association, which allowed the presentation of the tone at test to retrieve the memory of the light. Observation of the darkened bulb when it was expected to be on produced generalization decrement (cf. Bouton, Doyle-Burr, & Vurbic, 2012). Covering the bulb, however, allowed rats to imagine that the light was on even though they couldn’t see it. Since the light must be on (as it always had been when the tone was present during training), they responded as if the light–tone compound was present.

Fig. 9

Top panel: Schematic representation of proposed mechanisms mediating behavior when a relevant cue is blocked from detection. a During A-alone test trials with B unlit, the excitatory B_ON is either explicitly absent, or the inhibitory cue B_OFF is present. b When B is covered at test, the representational account proposes that A retrieves a representation of light B. This representation (and its related associations with the outcome) influences behavior. If B predicts that the outcome will occur, behavior will be invigorated (Experiment 1); however, if B predicts that the outcome will not occur, behavior will be suppressed (Fast, Biedermann, and Blaisdell, 2016; Experiment 2) relative to when B’s absence is explicit (a). Alternatively, the Nonrepresentational Account suggests that covering light B removes cue B_OFF, leaving only the A_ON association to drive behavior. Bottom panel: Further elaboration of cues that are present during training according to the nonrepresentational account. AB+ training trials result in A_ON and B_ON cues becoming excitatory. During the intertrial interval (ITI), A_OFF and B_OFF cues are predicted to become inhibitory because they signal the absence of food

In contrast to the representational account just described, Dwyer and Burgess (2011) offered an alternative nonrepresentational account. According to their account, the cues that become conditioned include all cues present in the conditioning chamber. These include cues that are on, such as the onset of a light or tone, as well as cues that are off, such as the unilluminated light bulb and soundless speaker. For example, in Fast, Biedermann, and Blaisdell (2016), Experiment 1, food always occurred following presentations of the tone and light. Thus, tone ON and light ON each acquired excitatory properties (see Fig. 9). Food never occurred, however, during the intertrial interval, during which time the tone and light were absent. Thus, tone OFF and light OFF could acquire inhibitory properties. At test, presentation of just the tone was a condition of tone ON and light OFF, which is one excitatory cue and one inhibitory cue. Thus, responding was less than during training when both excitatory cues were present (tone ON and light ON). Covering the light, however, resulted in removing the inhibitory light OFF cue. Thus, when the tone was presented, there was an excitatory tone ON cue but no inhibitory light OFF cue. As a result, covering the light led to strong responding, similar to during training. Dwyer and Burgess present this clever alternative explanation for the results of Experiment 1 of Fast, Biedermann, and Blaisdell (2016).
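The logic of this ON/OFF-cue account can be sketched with a simple error-correction model. The sketch below uses the Rescorla-Wagner learning rule, which is a swapped-in simplification and not the model used by Dwyer and Burgess (2011) or by the simulation in Fig. 11, which was based on Pearce's (1994) configural theory. It also assumes an always-present context cue, because under this rule the OFF cues can only become inhibitory if something excitatory accompanies them during the intertrial interval. The cue names, learning rate, and number of training sweeps are hypothetical.

```python
def train(trials, cues, alpha=0.1, sweeps=2000):
    """Rescorla-Wagner updates: all cues present on a trial share one prediction error."""
    strength = {cue: 0.0 for cue in cues}
    for _ in range(sweeps):
        for present, outcome in trials:
            error = outcome - sum(strength[cue] for cue in present)
            for cue in present:
                strength[cue] += alpha * error
    return strength


CUES = ["context", "tone_on", "light_on", "tone_off", "light_off"]

# Training: tone-light compound followed by food (1.0), and the intertrial
# interval with both cues off and no food (0.0).
# The always-present "context" cue is an assumption of this sketch.
TRIALS = [
    (["context", "tone_on", "light_on"], 1.0),
    (["context", "tone_off", "light_off"], 0.0),
]

v = train(TRIALS, CUES)


def predicted_response(present):
    """Summed associative strength of the cues present at test."""
    return sum(v[cue] for cue in present)


# Tone-alone test with the bulb visible and dark: the light_off cue is present.
uncovered = predicted_response(["context", "tone_on", "light_off"])
# Tone-alone test with the bulb shielded: the light_off cue is removed.
covered = predicted_response(["context", "tone_on"])

print(f"light_off strength: {v['light_off']:.3f}")
print(f"uncovered: {uncovered:.3f}  covered: {covered:.3f}")
```

Under these assumptions, the light OFF cue ends up with negative strength, so removing it by covering the bulb yields a larger predicted response to the tone than leaving the dark bulb in view, which is the ordering the nonrepresentational account needs for Experiment 1.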

While both theoretical explanations account for the results of Fast, Biedermann, and Blaisdell (2016), and the results of Fast and Blaisdell (2011) and Fast, Flesher, et al. (2016), they employ very different mechanisms. Dwyer and Burgess’s (2011) nonrepresentational account involves the removal of associative cues that had been present during training. Our representational account involves the active retrieval of a representation (and its associative or causal value) to exert behavioral control.

Fast, Biedermann, and Blaisdell (2016) designed a test to doubly dissociate these two accounts in Experiment 2 using a Pavlovian conditioned inhibition procedure (see Fig. 10). Rats first learned a Pavlovian conditioned inhibition discrimination in which separate presentations of two auditory cues, A and B, were followed by the delivery of food. No food was delivered on compound trials with AX (i.e., A+, B+, AX−). After confirming in a summation test that X was acting as a conditioned inhibitor through its ability to inhibit responding to B (see Fig. 10, lower left), rats received an X-absent test in which they were tested separately on A alone and B alone (A−, B−) with inhibitory cue X’s bulb either uncovered or covered. The representational account predicts that, given that A and X had been paired, but B and X had not been paired during training, the A–X within-compound association allows only A, and not B, to retrieve a representation of X at test. Thus, when X is covered at test, the representation of X retrieved by A should inhibit excitatory responding to A, while excitatory responding to B should not be inhibited. The nonrepresentational account of Dwyer and Burgess (2011), however, holds that A+, B+, AX− training should be considered as A_ON B_OFF X_OFF ➔ Food, A_OFF B_ON X_OFF ➔ Food, and A_ON B_OFF X_ON ➔ No Food pairings. As a result, A_ON and B_ON are each paired with food 50% of the time, and thus should become excitatory cues. X_ON is paired with no food and thus should become an inhibitory cue. Finally, and critically, X_OFF is paired with food on 100% of trials (not counting the ITI), and thus should become an excitatory cue. Covering X at test should remove the excitatory X_OFF cues on tests of both A and B, and thus, responding should be reduced on both A and B test trials when X is covered compared to when X is uncovered and off, in which case it serves as an excitatory cue. Simulation of the nonrepresentational account shows these predictions (see Fig. 11, left).
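The prediction that covering X should reduce responding to A and B alike follows from the additive structure of the ON/OFF-cue account and can be illustrated with a minimal sketch. As a labeled simplification, the sketch uses the Rescorla-Wagner rule rather than Pearce's (1994) configural theory, which was the basis of the simulation shown in Fig. 11; it also omits the intertrial interval and assumes equal numbers of each trial type. All cue names and parameter values are hypothetical.

```python
def train(trials, cues, alpha=0.1, sweeps=2000):
    """Rescorla-Wagner updates: all cues present on a trial share one prediction error."""
    strength = {cue: 0.0 for cue in cues}
    for _ in range(sweeps):
        for present, outcome in trials:
            error = outcome - sum(strength[cue] for cue in present)
            for cue in present:
                strength[cue] += alpha * error
    return strength


CUES = ["a_on", "a_off", "b_on", "b_off", "x_on", "x_off"]

# A+, B+, AX- training recoded with explicit OFF cues (ITI omitted, an assumption).
TRIALS = [
    (["a_on", "b_off", "x_off"], 1.0),   # A+  : food
    (["a_off", "b_on", "x_off"], 1.0),   # B+  : food
    (["a_on", "b_off", "x_on"], 0.0),    # AX- : no food
]

v = train(TRIALS, CUES)


def predicted_response(present):
    """Summed associative strength of the cues present at test."""
    return sum(v[cue] for cue in present)


# X-absent test: X's bulb is either visible and dark (x_off present) or covered (x_off removed).
a_uncovered = predicted_response(["a_on", "b_off", "x_off"])
a_covered = predicted_response(["a_on", "b_off"])
b_uncovered = predicted_response(["a_off", "b_on", "x_off"])
b_covered = predicted_response(["a_off", "b_on"])

print(f"x_off: {v['x_off']:.3f}  x_on: {v['x_on']:.3f}")
print(f"A: uncovered {a_uncovered:.3f}, covered {a_covered:.3f}")
print(f"B: uncovered {b_uncovered:.3f}, covered {b_covered:.3f}")
```

In this sketch the X OFF cue becomes excitatory and the X ON cue inhibitory, and removing X OFF lowers the predicted response to A and to B by the same amount. A selective reduction to A alone cannot arise from these additive terms.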

Fig. 10

Top panel: Design of Experiment 2 of Fast, Biedermann, and Blaisdell (2016). Subjects required approximately 40 sessions to master the conditioned inhibition training. After demonstrating mastery of the discrimination, subjects received a summation test in which the ability of X to inhibit responses to another well-established CS (B) was assessed. Following summation, one refresh session identical to conditioning was conducted. Rats then received the critical X absent test in which probe trials of A and B were presented while X was explicitly absent (Uncovered) or ambiguous (Covered). Subjects then received one additional refresh session before advancing to the retardation-of-acquisition test of inhibition. Bottom Left panel: Mean elevation scores (nose poke duration during CS—nose poke duration before CS) during summation test of Fast, Biedermann, and Blaisdell (2016), Experiment 2. Error bars represent the 95% confidence interval, *** reflect differences between compared trials (indicated by the horizontal line positioned over the bars) at the .001 level of significance. At the conclusion of training (A), subjects responded at equivalently high levels to both CS A (white bars) and CS B (gray bars) with a significant reduction during AX- trials (black bars) and during novel BX- trials (dark gray, rightmost bar) compared with B+ trials, as evidence of successful summation. Bottom Right panel: Mean elevation scores during the critical X absent test of Fast, Biedermann, and Blaisdell (2016), Experiment 2. ** reflect differences between compared trials (indicated by the horizontal line positioned over the bars) at the .01 level of significance while ‘ns’ indicates nonsignificant differences. The horizontal, dashed line represents mean CRs to CS A and CS B (that did not differ) during the preceding standard summation session. 
Responses to B (right) did not differ; however, subjects responded significantly less to A (left) when X was Covered (gray bars) compared with Uncovered and unlit (white bars).

Fig. 11

Left panel: Simulation of Pearce’s (1994) configural theory for Experiment 2 of Fast, Biedermann, and Blaisdell (2016) with OFF cues as suggested by Dwyer and Burgess (2011) for their nonrepresentational account. Right panel: Results of tests of A and B during the X-absent test reported by Fast, Biedermann, and Blaisdell (2016), Experiment 2. Note that the simulation of the nonrepresentational account fails to accurately predict the difference in responding to A versus B during the X-absent test with X covered. The results instead support the representational account of Fast and Blaisdell (2011)

What were the results of this experiment? With X’s bulb uncovered, nose-poke responses on A trials and B trials were high, indicating expectation of food. This is consistent with both accounts. When X was covered, however, nose-poke responding was inhibited during A, but not during B (see Fig. 10, lower right). This pattern of results is consistent with the representational account. The reduced responding to A but not B when X was covered at test suggests that an image of X was retrieved, leading rats to believe that X was present only during presentations of A. The retrieved representation of X was therefore able to inhibit excitatory responding to A (i.e., mediated conditioned inhibition in Holland’s, 1990, terminology). The lack of a B–X within-compound association prevented B from retrieving a representation of X, and thus no inhibition by X’s representation was possible.

The results of the X-covered tests fail to support the predictions of the nonrepresentational account of Dwyer and Burgess (2011; Fig. 11, left). Because covering X should have removed the excitatory X_OFF cue, responding should be equally reduced on both A and B probe test trials.

Additional evidence for the retrieval of an image of X by its associate A comes from comparisons of responding to A prior to and following the X-absent test of A and B with X covered or uncovered. The experimental design in the top panel in Fig. 10 shows that refresher training sessions consisting of A+, B+, and AX− were given prior to and following this X-absent test. As expected, A elicited high rates of responding during the refresher session prior to the X-absent test (see Fig. 12). This was true for rats that would be subsequently tested on A with X uncovered and covered. During the X-absent test that followed, A was presented on nonreinforced probe trials, allowing for some extinction to occur. If the presentation of A on nonreinforced test trials retrieves an image of X, then when X’s bulb is uncovered, subjects should be able to determine that X is explicitly absent, and thus A alone should undergo normal extinction. If X’s bulb is covered, however, then the image of X should remain active. The rats should therefore treat this trial as an AX− trial, as discussed above, in which case no extinction to A should occur. That is, the presence of X’s image should protect A from extinction. Indeed, at the beginning of the refresher training session following the X-absent test, we observed normal extinction to A if X had been uncovered, but no extinction to A if X had been covered, confirming that X’s image in the Covered condition protected A from extinction during the X-absent test (Fig. 12, lower right).
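The protection-from-extinction argument can be expressed as a short worked sketch. It assumes a Rescorla-Wagner style error-correction rule and, as the key assumption of the representational account, that a retrieved image of X contributes to the prediction on a probe trial just as the physically present X would. The starting strengths, learning rate, and number of probe trials are hypothetical values chosen for illustration, not estimates from the data.

```python
def extinguish_a(v_a, v_x, x_treated_as_present, alpha=0.2, trials=8):
    """Apply nonreinforced A probe trials and return A's final associative strength."""
    for _ in range(trials):
        # If an image of X is active, its inhibitory strength enters the prediction.
        prediction = v_a + (v_x if x_treated_as_present else 0.0)
        error = 0.0 - prediction   # no food is delivered on probe trials
        v_a += alpha * error       # only A's strength is tracked in this sketch
    return v_a


# Hypothetical strengths after A+, AX- training: A fully excitatory, X fully inhibitory.
V_A_START, V_X = 1.0, -1.0

# Bulb visible and dark: X is explicitly absent, so A is extinguished normally.
uncovered = extinguish_a(V_A_START, V_X, x_treated_as_present=False)
# Bulb covered: the image of X cancels A's prediction, so there is no error to learn from.
covered = extinguish_a(V_A_START, V_X, x_treated_as_present=True)

print(f"A after probes, X uncovered: {uncovered:.3f}")
print(f"A after probes, X covered:   {covered:.3f}")
```

With these values, A loses strength across probe trials when X is explicitly absent but keeps its full strength when X is treated as present, because the expected absence of food is already fully predicted and no prediction error is generated.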

The strong excitation to A in Group Covered versus Uncovered affected the subsequent retardation-of-acquisition test of inhibition to X (see Fig. 12). In a retardation-of-acquisition test, the putative inhibitory stimulus is paired directly with the US to establish an excitatory conditioned response. If the putative inhibitor is truly inhibitory, then it should be slow to acquire an excitatory conditioned response when compared with a novel cue (Y). If we plot responding to X across sessions during retardation-of-acquisition training separately for rats in Groups Covered and Uncovered, we see that excitatory responding to X increases rapidly in Group Uncovered, but not at all in Group Covered. Thus, Group Uncovered showed little to no retardation-of-acquisition effect, indicating weak, if any, inhibition to X. Why would inhibition to X be weak in Group Uncovered? Probably because the training excitor, A, had undergone extinction during X-absent testing as discussed above. Even after refresher training, it appears that A was not strongly excitatory, and thus, could no longer support strong inhibitory responding to X (Hallam, Matzel, Sloat, & Miller, 1990; Lysle & Fowler, 1985).

Group Covered, on the other hand, showed a strong retardation-of-acquisition effect, indicative of strong inhibitory value. This was expected because during X-absent testing, X was covered, allowing the image of X to protect A’s excitatory value from extinction. During retardation-of-acquisition training sessions, A’s strong excitation supported strong inhibition to X, which translated into a strong retardation effect.

Unresolved issues

The evidence from Experiment 2 of Fast, Biedermann, and Blaisdell (2016) provides the strongest support to date for the representational account of how rats deal with missing information (see Fig. 12). Nevertheless, the psychological basis of this representation is still an open question. It may be merely an expectation without memory or perceptual components, it may take the form of a memory (including beliefs or propositional knowledge), or it may possibly even form an image in the animal’s “mind,” in the same way that imagery can be part of the quality of the human mind. This last possibility appears the most intractable to empirically validate. Nevertheless, procedures could be developed to differentiate between these accounts.

Fig. 12

Top panel: Design of Experiment 2 of Fast, Biedermann, and Blaisdell (2016). Dashed circle indicates session from which retardation test data were collected. Bottom left panel: Mean elevation scores (nose-poke duration during CS–nose-poke duration before CS) normalized to first session during retardation-of-acquisition test. Mean responses to X+ (solid lines) and Y+ (dashed lines) trials across the five retardation sessions for subjects previously tested with X uncovered (black lines) or covered (gray lines). Despite equivalent performance on the first session, only subjects that previously experienced the X-absent test while X was uncovered showed a significant increase in responses to X across the retardation-of-acquisition sessions. Bottom right panel: Subjects in group X uncovered also showed a significant reduction in responses to the first A+ trial of the refresher session that preceded retardation and followed the X-absent test

Indeed, the idea that animals may experience mental images is not as far-fetched as it might initially seem, nor does it reflect an anthropomorphic perspective on the animal mind. Darwin (1871) was the first to propose an evolutionary, comparative account of the notion of mental continuity between Man and other animals. This account is grounded in sound principles of shared brain evolution resulting in behavioral homologies reflecting shared psychological processes. Barsalou (1999) established a strong argument for the view that cognition is grounded in perceptual symbol systems (see Damasio, 1989, for a similar argument for memory). This should be true not just for humans but for any organism with an advanced enough nervous system that has evolved “privatized” sensations (Humphrey, 2000) that can be associatively linked and retrieved. Relevant to the aims of our research, Barsalou argues with strong empirical support (e.g., Crammond, 1997; Jeannerod, 1994; Kosslyn, Thompson, Kim, & Alpert, 1995; Zatorre, Halpern, Perry, Meyer, & Evans, 1996) that “sensory-motor systems represent not only perceived entities but also conceptualizations of them in their absence. From this perspective, cognition penetrates perception when sensory input is absent, or when top-down inferences are compatible with sensory input” (Barsalou, 1999, p. 589). Thus, “when bottom-up information conflicts with top-down information, the former usually dominates. When bottom-up information is absent, however, top-down information penetrates, as in mental imagery” (Barsalou, 1999, p. 589). This mirrors our proposed representational account of the effects of covering a light to create an ambiguous situation. When an associate of the light is present, and the light is explicitly off (uncovered), the bottom-up information of the light being off dominates. 
When an associate of the light is present, but the light is ambiguously absent (covered), then the top-down information acquired previously can dominate and lead the rat (like the human) to imagine that the light might (or must, depending on the strength of the prior contingency) be on.
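
The contrast between explicit and ambiguous absence can be made concrete with a toy Bayesian calculation. The sketch below is only an illustration under stated assumptions, not the model tested in the experiments described here: the function name `posterior_light_on`, the prior values, and the 0.02 miss rate are all hypothetical. It treats the prior contingency as a top-down expectation that the light is on, and the observation "no light seen" as the bottom-up evidence.

```python
def posterior_light_on(prior_on: float, covered: bool, miss_rate: float = 0.02) -> float:
    """Posterior probability that the light is on, given that no light is seen.

    prior_on  -- top-down expectation that the light is on, e.g. because its
                 associate is present (hypothetical value)
    covered   -- True if an opaque shield blocks the view of the bulb
    miss_rate -- assumed chance of failing to see a lit, uncovered bulb
    """
    if not 0.0 <= prior_on <= 1.0:
        raise ValueError("prior_on must be a probability")
    # Likelihood of the observation "no light seen" under each state.
    # A covered bulb looks dark whether it is on or off, so the observation
    # carries no information; an uncovered bulb looks dark almost only when off.
    p_dark_given_on = 1.0 if covered else miss_rate
    p_dark_given_off = 1.0
    numerator = p_dark_given_on * prior_on
    evidence = numerator + p_dark_given_off * (1.0 - prior_on)
    return numerator / evidence


for prior in (0.5, 0.9):
    explicit = posterior_light_on(prior, covered=False)
    ambiguous = posterior_light_on(prior, covered=True)
    print(f"prior={prior:.1f}  uncovered={explicit:.3f}  covered={ambiguous:.3f}")
```

Under these assumptions, an uncovered dark bulb drives the posterior far below the prior (bottom-up information dominates), whereas a covered bulb leaves the posterior equal to the prior, so a stronger prior contingency yields a stronger belief that the light is on.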

One phenomenon that defies explanation within the framework developed by Barsalou (1999), in which cognition is grounded in perception, comes from reports of aphantasia in a small minority of people. Zeman, Dewar, and Della Sala (2015) coined the term aphantasia based on reports of certain people being unable to form visual representations in their “mind’s eye.” While documentation of this phenomenon is rare, it certainly seems to be real (de Vito & Bartolomeo, 2016; Keogh & Pearson, 2018; Zeman et al., 2016). How can the existence of aphantasia be reconciled with the thesis that cognition is grounded in perception? One possibility is that individuals reporting aphantasia actually do experience mental imagery, but that such experiences are not accessible to conscious awareness. Individuals vary in their reported experience of mental imagery, falling along a spectrum of vividness. At one end of the spectrum are people, such as those with schizophrenia, who suffer from vivid hallucinations that they have trouble distinguishing from reality. At the other end of the spectrum may be aphantasics, who report minimal or no mental imagery. Perhaps individuals reporting aphantasia experience mental imagery at the fundamental, preconscious level, such as in early visual and auditory processing, but by the time the information reaches consciousness, it has been completely transformed into other forms, such as propositional knowledge, beliefs, causal maps, cognitive maps, and semantic memory. Spatial imagery has been found to be as good in aphantasics as in individuals with typical imagery, if not better, despite little to no visual imagery experience (Keogh & Pearson, 2018). Research on aphantasia is in its infancy, with only a handful of reports; thus, more research is needed to understand this phenomenon. It might prove fruitful to look for individual differences in strength of mental imagery in animal studies as well.

It is clear from the studies discussed in the beginning of this article that most humans show evidence of true visual imagery. Furthermore, image retrieval in humans is elicited by verbal prompts and other forms of cross-modal transfer. For example, Cronly-Dillon, Persaud, and Gregory (1999; see also Cronly-Dillon, Persaud, & Blore, 2000; and review by Poirier, De Volder, & Scheiber, 2007) trained blindfolded (or blind) subjects on auditory–visual figure associations (the blind participants previously had normal vision, which they lost at some point in their lifetime; thus, they could form mental visual images). Following training, subjects could retrieve separate visual forms from auditory associates and construct the completed visual image. Blindfolded participants also experience visual illusions and other visual imagery effects in auditory substitution tasks (Renier, Bruyer, & De Volder, 2006; Renier et al., 2005), and these effects are argued to reflect a visual mental imagery process.

But what about nonhuman animals? We know that rats can form excitatory associations between two associatively retrieved memories. For example, rats were given a peppermint-flavored solution to drink in Context 1, and an almond-flavored solution containing sucrose to drink in Context 2. This should establish an almond–sucrose association. Rats then were given the almond solution alone (without sucrose) in Context 1. At test in a novel context, these rats drank more peppermint solution than did rats in various control conditions omitting one or more of these prior experiences (Dwyer, Mackintosh, & Boakes, 1998; see review of these phenomena by Pickens & Holland, 2004). Thus, the association between Context 1 and peppermint resulted in the retrieval of a memory of peppermint during the second phase. Likewise, the association between almond and sucrose enabled the presentation of almond in Phase 2 to retrieve a representation of sucrose. The coincidentally activated representations of peppermint and sucrose enabled an association to form between them, thereby leading rats to increase their consumption of peppermint (now associated with sucrose) in the final test. It has been argued that this type of mediated associative learning between absent events plays a central role during development (Cuevas, Rovee-Collier, & Learmonth, 2006). But is there any evidence yet for true visual imagery when memories for events are retrieved? The study by Fast, Biedermann, and Blaisdell (2016) comes closest, but still fails to discriminate between a true image of the absent cue and a salient memory and expectation of its presence that lack the sensory qualities true images entail. Visual imagery in animals remains an open question, but one that we feel is experimentally tractable. If animals are found capable of forming mental images, then the higher-order associative phenomena found by Dwyer et al.
(1998) might allow animals to also create novel mental images through the combination of separate associatively retrieved images, as proposed by Dennett (1995) in his thought experiment, and that support top-down mediated mental simulations in humans (Moulton & Kosslyn, 2009). Such a process would play a functional role in simulating future scenarios, a process that some memory researchers believe is a more important role of the episodic memory system than the formation and retention of episodic memories themselves (Addis & Schacter, 2012; Schacter, Norman, & Koutstaal, 1998).

Inconsistencies in the role of mental images of absent events

The evidence in our lab shows that rats distinguish between an event that is explicitly absent and one that is ambiguously absent, such as when the source of the event (e.g., a light) is perceptually obscured (e.g., by an opaque shield). Yet, as already mentioned, others have found mediated conditioning even when an associated event is explicitly absent. The experiments reviewed by Holland (1990) show mediated conditioning processes, such as acquisition, extinction, and overshadowing, as well as characteristics of the conditioned response itself. The experiments on mediated conditioning in our lab and by Holland (1990) and Hall (1996) are all examples of positive mediated conditioning. In positive mediated conditioning, the change in associative value of the absent event is the same as that of the presented event. For example, if a light and tone are paired during Phase 1 of sensory preconditioning, and then the tone is paired with food in Phase 2, the light will now elicit responding as if it had been itself directly paired with food. Another type of mediated conditioning involves the opposite change in associative value to the absent event compared to the presented event. Examples of such negative mediation come from studies of retrospective revaluation. An example of retrospective revaluation is the recovery from cue-competition effects. For example, in a blocking procedure, CS A is paired with a foot shock US in Phase 1, and then compound CS AX is paired with foot shock in Phase 2. This treatment often results in CS X eliciting less conditioned responding than it would have if A had not been paired with the US in Phase 1 (but see Maes et al., 2016). If A is sufficiently extinguished after Phase 2 of training, however, then responding to X will increase (recover) at test. Recovery of responding to X occurs despite no further training of X, and despite the fact that X was absent during the extinction treatment of A (Blaisdell, Gunther, & Miller, 1999). 
The associative value of X increased as a result of that of A decreasing (see Blaisdell, 2003, for a review of retrospective revaluation effects in Pavlovian conditioning). A number of models have been proposed to account for retrospective revaluation (e.g., Dickinson & Burke, 1996; Stout & Miller, 2007; Van Hamme & Wasserman, 1994). The transition from positive mediation to negative mediation seems to take place as a function of amount of training (Stout, Escobar, & Miller, 2011; Yin, Barnet, & Miller, 1994). In contrast to our lab, in which positive mediation is found only for ambiguously absent events, other labs have found both positive and negative mediation of explicitly absent events. A resolution to this discrepancy will be discussed below.
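
The blocking and recovery-from-blocking sequence described above can be simulated with a simple error-correction rule. The sketch below is a deliberately simplified illustration, not a faithful implementation of any of the cited models: it uses a Rescorla-Wagner-style update for present cues and, following the general idea attributed to Van Hamme and Wasserman (1994), gives an absent but expected cue a negative learning rate. The function names, learning rates, and trial counts are arbitrary assumptions chosen for illustration.

```python
from typing import Dict, Iterable


def run_phase(
    strengths: Dict[str, float],
    present: Iterable[str],
    outcome: float,
    n_trials: int,
    expected_absent: Iterable[str] = (),
    alpha_present: float = 0.3,
    alpha_absent: float = 0.0,
) -> Dict[str, float]:
    """Apply n_trials of one trial type and return updated associative strengths."""
    v = dict(strengths)
    present = list(present)
    expected_absent = list(expected_absent)
    for _ in range(n_trials):
        # Prediction error: outcome actually received minus the summed
        # prediction of the cues that are physically present.
        error = outcome - sum(v[cue] for cue in present)
        for cue in present:
            v[cue] += alpha_present * error
        # Negative-alpha step: an absent cue that is expected (because a
        # present associate retrieves it) is updated with alpha_absent.
        # With alpha_absent = 0 this reduces to the Rescorla-Wagner rule,
        # which leaves absent cues unchanged.
        for cue in expected_absent:
            v[cue] += alpha_absent * error
    return v


def blocking_then_extinction(alpha_absent: float) -> Dict[str, float]:
    """Phase 1: A+, Phase 2: AX+, then extinction of A alone with X absent."""
    v = {"A": 0.0, "X": 0.0}
    v = run_phase(v, ["A"], outcome=1.0, n_trials=50)       # Phase 1: A+
    v = run_phase(v, ["A", "X"], outcome=1.0, n_trials=50)  # Phase 2: AX+
    blocked_x = v["X"]
    # Post-training extinction of A; X is absent but assumed to be retrieved by A.
    v = run_phase(v, ["A"], outcome=0.0, n_trials=30,
                  expected_absent=["X"], alpha_absent=alpha_absent)
    return {"X_after_blocking": blocked_x, "X_after_A_extinction": v["X"]}


print("No update of absent cues:", blocking_then_extinction(alpha_absent=0.0))
print("Negative alpha for X:    ", blocking_then_extinction(alpha_absent=-0.15))
```

In this sketch, X gains almost no strength after AX+ training because A already predicts the outcome. Extinguishing A leaves X unchanged when absent cues are not updated, but raises X when the absent cue carries a negative learning rate, which corresponds to negative mediation. Setting `alpha_absent` to a positive value would instead move X in the same direction as A, which corresponds to positive mediation as defined above.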

The role of missing information in contingency updating

Through experience, we learn the statistical regularity of relations between events in the world. This contingency learning requires the detection of all events that happen in close spatial and temporal contiguity to each other, and forms the basis of Bayesian and causal model analysis (Chater & Oaksford, 2008; Waldmann, Cheng, Hagmayer, & Blaisdell, 2008). Statistical learning of associations not only builds our knowledge base, it also enhances item representations and their relations (Barakat, Seitz, & Shams, 2013). Nevertheless, many aspects of our environment are nonstationary, which necessitates that we update our contingency knowledge when relations among events change (Racey, Young, Garlick, Pham, & Blaisdell, 2011). A central concern over the history of learning theory in both animals and humans has been the principles by which knowledge is updated and behavior changes in the face of changing statistical contingencies (Mackintosh, 1975; Rescorla & Wagner, 1972). Because learning environments are always impoverished, these principles bear critical consideration in all types of learning, such as machine learning (Karklin & Lewicki, 2005) and the acquisition of language in children (Thomas, 2002). Indeed, this is the central topic of empirical and theoretical questions about learning, extinction, the transition from excitatory to inhibitory representations (Baetu & Baker, 2009; Yin, Barnet, & Miller, 1994), retrospective revaluation (e.g., Blaisdell, 2003; Denniston, Savastano, & Miller, 2001; Dickinson & Burke, 1996; Van Hamme & Wasserman, 1994), and mediated learning (Cuevas et al., 2006; Pickens & Holland, 2004) of contingencies and stimulus–stimulus representations. The Bayesian statistical framework has also illuminated the role of neurotransmitters in modulating contingency learning in conditions of uncertainty (Yu & Dayan, 2005).

Currently, however, learning theories ignore the issue of contingency learning and updating about ambiguously absent events. It is still under debate whether humans and other animals learn contingency and perceptual information in ambiguous situations following principles of associative learning (e.g., Castro, Wasserman, & Matute, 2009) or Bayesian modeling (Fiser, 2009). The tenet of causal model theory (and causal Bayes nets) regarding hidden or latent causal variables provides a unique insight into this process (Carroll, Cheng, & Lu, 2013; Gopnik et al., 2004; Hagmayer & Waldmann, 2007; Luhmann & Ahn, 2007). Specifically, causal Bayes net theories can make specific predictions for different patterns of observations by using Bayesian inferences about unobserved causes (see Hagmayer & Waldmann, 2007, for a detailed analysis). It is more likely that an unobserved second cause is also present when a cause and an effect are both observed than when the cause is observed without an effect. Even if mediated conditioning is viewed as an associative, rather than causal, phenomenon, it may still reflect a different type of acquisition process than that involved with forming associations between physically paired events (as opposed to associatively retrieved events; Lin & Honey, 2016).
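
The claim that an unobserved second cause is more likely when a cause and its effect are both observed than when the cause occurs without the effect can be checked with a small worked example. The sketch below assumes a noisy-OR parameterization of a two-cause network, which is one common choice but is not specified in the text; the function name, the prior, and the causal strengths are hypothetical values chosen for illustration.

```python
def posterior_hidden_cause(
    effect_present: bool,
    prior_hidden: float = 0.3,
    w_observed: float = 0.5,
    w_hidden: float = 0.8,
) -> float:
    """P(hidden cause present | observed cause present, effect present or absent).

    prior_hidden -- assumed prior probability that the hidden cause is present
    w_observed   -- assumed causal strength of the observed cause
    w_hidden     -- assumed causal strength of the hidden cause
    """
    # Noisy-OR: each present cause independently fails to produce the effect
    # with probability (1 - its causal strength).
    p_effect_with_hidden = 1.0 - (1.0 - w_observed) * (1.0 - w_hidden)
    p_effect_without_hidden = w_observed
    if effect_present:
        like_hidden = p_effect_with_hidden
        like_no_hidden = p_effect_without_hidden
    else:
        like_hidden = 1.0 - p_effect_with_hidden
        like_no_hidden = 1.0 - p_effect_without_hidden
    # Bayes' rule over the two states of the hidden cause.
    numerator = like_hidden * prior_hidden
    return numerator / (numerator + like_no_hidden * (1.0 - prior_hidden))


with_effect = posterior_hidden_cause(effect_present=True)
without_effect = posterior_hidden_cause(effect_present=False)
print(f"P(hidden | cause, effect)    = {with_effect:.3f}")
print(f"P(hidden | cause, no effect) = {without_effect:.3f}")
```

Under these assumptions, observing the effect raises the probability of the hidden cause above its prior, and observing its absence lowers it below the prior, which matches the ordering stated above.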

What role might imagery play in modulating decisions about contingency updating in ambiguous situations? It would be of great interest to study the role of imagery in contingency learning, such as in the acquisition of new associations, behavioral extinction, the transition from excitatory to inhibitory responding, and retrospective revaluation effects. Our prior evidence for imagery in rats suggests the hypothesis that, when the absence of information might result from obscured perception, the subject (rat or human) entertains the possibility that the missing event is actually present. Furthermore, when the subject holds a belief that the absent event might indeed be present (e.g., when the absent event’s associate is present), it should be more conservative about updating prior contingency information, compared to when the expected event is explicitly absent (e.g., when the source of the absent event is observable). This hypothesis is consistent with the tenets of causal model theory and causal Bayes nets, but has not yet been empirically addressed. One prediction is that rats should be more conservative in updating event contingencies when an expected event is ambiguously absent (e.g., by being covered) than when it is explicitly absent, such as was reported by Waldmann et al. (2012).

As discussed above, contingency updating of an explicitly absent event has been reported both for positive and negative mediation. Nevertheless, such empirical findings tend to be quite weak, and the theories devised to explain them fail to distinguish updating of a cue’s value depending on whether the cue is covered or uncovered. The imagery hypothesis predicts that covering the source of the absent event (e.g., X’s bulb in the experiments by Fast, Blaisdell, and associates) can render the updating of X much stronger when an associate of the absent event undergoes a change in contingency. Thus, if A and X were to become associated in Phase 1 of treatment, we would predict X to undergo similar changes in associative strength as A during a subsequent Phase 2 extinction treatment, but most readily when it is covered. Mediated extinction is sometimes found, though it is a weak effect (e.g., Holland, 1990), but we would expect mediated conditioning to be significantly stronger when the missing cue is expected to be present and it is covered, thereby preventing the subject from noticing its explicit absence. If found, such results would hold interesting implications for reality monitoring and the ability to differentiate direct perception from hallucinations. Hallucinations can be thought of as weaker or less salient versions of actual sensory/perceptual events (Aleman, Nieuwenstein, Böcker, & De Haan, 2000). The experience of hallucinations in some forms of psychopathology, such as schizophrenia, might relate to an impairment in brain systems involved in reality monitoring. Thus, an animal model of reality monitoring and hallucinations involving associatively retrieved images would be a useful tool for understanding how brain systems contribute to reality monitoring and its impairment in psychopathology.

Conclusions

The ability of a stimulus to retrieve the representation of another absent stimulus via an associative or causal link forms the basis of causal cognition (Blaisdell, 2009; Blaisdell & Waldmann, 2012), episodic memory (Clayton & Dickinson, 1998; Crystal, 2009), perceptual memory (Barsalou, 1999; Goldstone & Barsalou, 1998), representation of outcome quality (Balleine & Dickinson, 1998), perceptual binding (Postma, Kessels, & van Asselen, 2008), pattern completion (Fast, Biedermann, & Blaisdell, 2016; Fast & Blaisdell, 2011; Fast, Flesher, et al., 2016; Rudy & O’Reilly, 2001), and, some have argued, image and action (Fast, Biedermann, & Blaisdell, 2016; Holland, 1990). Mental imagery might also play a mediating role in a diverse range of behavioral phenomena in animals, such as mental time travel (Cheke & Clayton, 2010; Clayton, Bussey, Emery, & Dickinson, 2003), flexible use of prospective and retrospective coding (Cook, Brown, & Riley, 1985), prospective memory (Crystal, 2013; Wilson, Pizzo, & Crystal, 2013), and encoding of a cognitive map (Blaisdell, 2009; Savastano & Miller, 1998; Wikenheiser & Redish, 2015).

We have explored the evidence that animals such as rats can distinguish the explicit absence of an expected event from its ambiguous absence. For example, as Fast and Blaisdell (2011) have shown, covering a light that had previously served as a discriminative cue for a reinforced response causes the rat to act as if it considers the possibility that the obscured light is present. Whether or not the obscured light is present is highly relevant to whether food can be expected for pressing a lever. This type of reasoning process depends on a functioning hippocampus (Fast, Flesher, et al., 2016), and involves the associative retrieval of a representation of the missing event (Fast, Biedermann, & Blaisdell, 2016).

This work is important because the capacity to resolve ambiguities in contingency learning in rats, a species whose learning processes are predominantly modeled in the literature by associative theories, has only recently come under empirical scrutiny. These issues have their origins in philosophy (e.g., Aristotle, Hume, Locke, Kant), and only recently have philosophers developed insights into how humans and other agents apply reasoning to discover accurate causal representations of the world (e.g., Glymour, 2001; Pearl, 2014; Spirtes, Glymour, & Scheines, 1993; Woodward, 2005). Since the foundations set by Thorndike and other behavioral psychologists, the focus in associative theorizing has changed from a behaviorist S–R perspective to a more cognitive focus on mental representations of events that take part in learning (Rescorla, 1988; Tolman, 1948). Currently, Pavlovian conditioning is in most cases modeled as involving associations between mental representations of the world (S–S learning; see Holland, 1990, for an overview). According to this view, cues may retrieve representations of predicted outcomes (e.g., food), which in turn may elicit behavioral responses. Moreover, cues may also retrieve representations of each other (Larkin, Aitken, & Dickinson, 1998), or cues may induce perceptual processing of outcomes in their absence (Holland, 1990).

Allowing representations to be part of the learning process dramatically increases the power of learning and breadth of conditions under which learning can take place. Nevertheless, an active role for such representations creates the problem of reality monitoring (see also Holland, 1990; Konorski, 1967). How does a learning system operating on retrieved representations decide whether they reflect reality or just illusory correlations? If in extinction learning, for example, a cue is followed by a representation of the outcome, how is it possible that an organism learns that the underlying contingency has changed?

To avoid insensitivity to the changes in the world, the learning system needs to be able to compare retrieved representations to real events that are currently taking place, and distinguish between physically present events and events whose representations are retrieved but for which the physical event is not currently present. Holland (1990) reported evidence showing that rats indeed distinguish between retrieved representations and physically current events. This distinction is easy when the retrieved representations can be directly compared to experiences of present events (e.g., food appearing in a food hopper). Holland has subsequently found that the ability to distinguish retrieved images from real sensory experiences, what psychologists refer to as reality testing, emerges gradually during Pavlovian conditioning (Holland, 2005; see also McDannald & Schoenbaum, 2009).

When no sensory information about the presence of the event is available, however, the organism needs to decide whether the event is indeed absent or whether the current information about the event is inconclusive because perceptual access to the event is blocked. For example, as described above, Fast and Blaisdell (2011) found evidence that, after acquiring a negative-patterning instrumental discrimination involving two lights, if one light was covered by an opaque shield at test, the rat held in memory an active representation of the covered light, which then influenced the rat’s expectation that food should not be present, thereby reducing the rate of lever pressing.

The fruits of this research can provide insight into the fundamental nature of reasoning processes, which can help refine existing models of learning and inference and guide the development of new models. This work also holds implications for cognitive science and philosophy by helping to discern the unique elements of human thought processes from those shared with other species. Knowledge of the similarities and differences can inform on the evolution of causal reasoning. Furthermore, studies of reasoning in humans can be difficult to interpret because language can contaminate the detection and measurement of more fundamental processes shared with other species. For example, when considering theories that human cognition is grounded in perception (Barsalou, 1999; Rissman & Wagner, 2012) versus amodal (Fodor, 1983), the symbolic nature of human language can bias behavior to appear amodal. Even language or token training in apes and parrots can change their cognitive capacities (e.g., Pepperberg & Gordon, 2005; Premack, 1983; Thompson, Oden, & Boysen, 1997). By studying nonlinguistic species directly, such as the rat, we can better isolate, identify, and study rational processes in the absence of contamination by language. In addition to the influence of language on imagery, the reverse may also be true, that imagery can influence language. Paivio (1975; see also Paivio, 1991) reviews the early research on the relationship between verbal and imagery processes, and concludes that verbal processes are dependent on mental imagery processes, especially as sentence length increases. While this dual-coding hypothesis has been challenged (Kosslyn, Pinker, Smith, & Shwartz, 1979), the influence of mental imagery, in particular visual imagery, on language is well documented (Paivio, 1991). Perhaps the comparative work in animals can ultimately shed light on a resolution. 
If animals are shown to be able to use mental imagery, and humans evolved from nonverbal ape ancestors that had neurophysiological mechanisms of mental imagery, and furthermore during the evolution of human language these neurophysiological systems remained intact (i.e., were not dismantled by natural selection), then it stands to reason that humans continue to have this more ancient mental imagery system to which a language or propositional system must have been a more recent addition. Regardless of whether or not mental imagery ultimately can influence language, it would be difficult to argue, in light of such comparative research in animals, that a system of mental imagery in humans must be subservient to or emergent from a propositional/language system as proposed by Kosslyn and colleagues (1979).

Answers to these questions can extend our knowledge of the evolutionary origins of rationality and separate purely rational processes from those predicated on linguistic symbolic processing. Thus, we can learn what aspects of reasoning are uniquely human and what aspects we share with other animals. This research can also lead to critical new insights into and new testable questions about the cognition and neuroscience of physical cognition, and processes of imagination and counterfactual reasoning, all of which play a central role in how individuals reason about the world. In sum, this research may inform on future explorations of, and constrain theories about, rational processes at the computational, cognitive, and neural levels of analysis. The ability to engage in imagination and mental simulation when information is incomplete serves as the basis of hypothesis generation and testing, a foundation for the scientific method. These investigations may even have an impact on more profound philosophical issues such as intentionality, logics, moral reasoning, and self-knowledge.