One of the oldest areas of inquiry in comparative psychology, stretching back well more than 100 years, is centered on the numerical abilities of other species (Beran, 2017; Beran et al., 2015; Boysen & Capaldi, 1992; Davis & Perusse, 1988). Dr. Sarah (Sally) Boysen has provided some of the most compelling data suggesting that chimpanzees can approximate some of the early numerical abilities shown by children. She showed that, for example, her chimpanzees could move to multiple locations in a room, see multiple sets of items, and then provide the Arabic numeral label matching that number of items (e.g., Boysen & Berntson, 1989). Any celebration of her career and contributions would focus heavily on the value of those studies (e.g., Boysen, 1992; Boysen & Hallberg, 2000). However, I am going to focus on a series of studies that was designed to look at numerical cognition but ended up instead providing one of the most interesting cases of difficulty experienced by animals in a task of inhibitory control, or the ability to avoid engaging in prepotent responses to stimuli that are goal irrelevant or suboptimal in a choice situation. I will outline the original findings, extensions of that work, and how those papers contributed to my own thoughts about another area of inquiry—studying self-control and delay of gratification across species.

The reverse-reward contingency task

In Boysen’s reverse-reward contingency task (or reverse-reward task), she had two chimpanzees sit near each other (Boysen & Berntson, 1995). One chimpanzee was the subject and was given the choice between two sets of food rewards. That chimpanzee pointed to one set, and the selected set was then given to the other chimpanzee. What was left (unchosen) was given to the subject. So, the rule, put simply, was to point to what you did not want, as that would be taken away and given to another chimpanzee. This introduced a strong inhibition demand to the task: the subject had to avoid pointing at or reaching toward the thing that it wanted, which in this case meant not pointing to more food over less food. Many species are so good at pointing or choosing more over less, especially for food items, that this was truly a difficult proposition: to point at what the chimpanzees would not get. And they struggled to do this, continually pointing to the larger set, only to see it taken away and given to the chimpanzee beside them. This study was published in the same year in which I first worked with chimpanzees, and so I tried this with the chimpanzees I had access to, thinking they would learn this “point away” rule with some experience. However, they did not, and those data went into the proverbial “file drawer” while I instead started thinking more about what it might take to show whether chimpanzees (and other primates) could engage in other forms of inhibitory control. Boysen’s task, along with a few other influences, steered me toward the topic of self-control and delay of gratification across species, and since then, I have used a number of approaches that I will briefly outline in this paper.

After the first report from Boysen and Berntson (1995), numerous other labs used the same procedure or adapted it in various ways (see Shifferman, 2009, for an earlier review of this work). I will not review all of those manipulations here other than to note that some variations improve performance on the task (also see Beran, 2018, for more description of other studies using this task). The main one is punishing responses to the larger set, usually by giving no reward for such responses, which can lead to some primates eventually learning to point to the smaller set (e.g., cotton-top tamarins: Kralik, 2012; Japanese macaques: Silberberg & Fujita, 1996), although it does not always work (e.g., cotton-top tamarins: Kralik et al., 2002) or can require very large numbers of trials before points to smaller sets occur (e.g., squirrel monkeys: Anderson et al., 2000; brown lemurs: Genty et al., 2011).

This is interesting, but I think it changes the nature of the task, because the use of punishment in the form of no food reward is probably much more salient than the receipt of the unchosen set (see Silberberg & Fujita, 1996). I will return to this idea later when reviewing my lab’s most recent effort to use the reverse-reward task. The more important results, at least in terms of their impact on my thinking about inhibition and self-control in chimpanzees and other primates, came from Boysen’s studies that used symbolic stimuli to determine whether those might change the chimpanzees’ approach to the task. Recall that Boysen had trained these same chimpanzees to match Arabic numerals to quantities (Boysen & Berntson, 1989), and so she sometimes presented the numerals in place of the candy rewards, and then the chimpanzees pointed to the smaller numeral, and received the larger reward! Boysen’s subsequent work showed that this manipulation worked even better than using analog, nonedible stimuli (e.g., stones), for which the chimpanzees still struggled to perform optimally (Boysen et al., 1996, 1999). But when numerals were compared with candies, a surprising result emerged. Chimpanzees were more likely to point to a numeral that was smaller in value than the number of candies in the other choice option than they were to point to the smaller set when two candy options were presented (Boysen et al., 1999). This result is particularly important because it shows that, sometimes, animals can choose nonprimary rewards in choice tasks and can avoid reaching toward or taking the only edible items presented on a trial. It may be that the comparison of candies to numerals also allows the task to include stimuli of differing levels of salience (or prepotency) and, therefore, the animals can more easily engage in a different decisional process that is less impulsive than when both choices are highly salient primary rewards.

Self-control and delay of gratification in chimpanzees

This last result I described, along with my own interests in developmental studies with children looking at delay of gratification, sparked my first attempt to study delay of gratification in chimpanzees. I also was interested in trying to create a version of the famous marshmallow test (Mischel, 2014; Mischel et al., 1989), in which children were instructed that if they could wait, they could earn a larger or better reward, and then they were left in the presence of a smaller or less preferred, but available, food reward. Over numerous experiments, Mischel and colleagues had documented the important contextual influences on performance, with perhaps one of the most important being that children who could somehow “transform” the way they thought about the rewards could perform very differently (e.g., Mischel, 1974; Mischel et al., 1972; Mischel & Mischel, 1983; Mischel & Moore, 1973). For example, thinking about the properties of the delayed rewards that were not focused on their taste or enjoyment would lead to much better performance than thinking about taste or how much the items were liked by the children (Mischel et al., 1972; Mischel & Baker, 1975). This result seemed very connected to Boysen’s discovery that transforming the nature of the reverse-reward task by replacing food items with symbols greatly affected performance, and so I tried to mimic as many of these factors as I could.

In my first experiment (Beran et al., 1999), chimpanzees were given a marshmallow-like test in which they were presented with two items: one they could get as soon as they pressed a button, and one that was delivered only later if they waited and did not press the button. Sometimes, the items were both foods, one much more preferred than the other. However, I sometimes used photographs in place of the foods, and also lexigrams, which were symbols that some of these chimpanzees had learned to associate with different kinds of items (Rumbaugh, 1977; Savage-Rumbaugh, 1986). What I found was a mix of performances. The chimpanzees sometimes would wait for the better reward when it was a food item, and they always pressed the button when the better item was the immediate reward. The symbolic lexigram and photograph trials also produced some successes, although in this case those stimuli did not enhance performance on this test of delayed gratification the way numerals had enhanced performance in the reverse-reward task. But, overall, the chimpanzees were able to use those stimuli effectively to obtain better rewards, even when they had to wait for those.

The developmental literature then offered a new direction for studies of primate delayed gratification, at least in my lab. Research with preschool-aged children had used a different variation of delayed receipt of reward in which children were given increasing accumulations of rewards, so long as they did not take any of the items already accumulated (Toner, 1981; Toner et al., 1979; Toner & Smith, 1977). The best way I know to describe the task is that it is like an interest-bearing savings account, where the money grows as long as you leave it alone. When given this task, many of the same contextual features affected children’s performances as in the marshmallow test, such as transforming the rewards in ways that highlighted their non-appetitive or hedonic value. I used this task for the first time with chimpanzees and an orangutan (Beran, 2002) and found that all of the primates could wait with dishes of food growing in front of them, suggesting another nice parallel with the development of self-control in humans. We conducted a series of studies using this task, now called the accumulation task, with chimpanzees and other primates, and fairly consistently found that chimpanzees and orangutans showed high levels of self-control (e.g., Beran & Evans, 2006; Beran, Perdue, et al., 2016b; Evans & Beran, 2007a; Parrish et al., 2014). Chimpanzees even showed some evidence of engaging in self-distraction, by using toys and other enrichment items to distract themselves from the accumulating rewards, but only when they needed to do that. When delay was imposed, and they did not have to inhibit taking the rewards, they used those toys less often (Evans & Beran, 2007b). Most recently, we showed that chimpanzees that perform best on batteries of general cognitive ability also show the highest levels of self-imposed delay of gratification, meaning they know best when they should try to delay gratification, because they are likely to be able to maintain their inhibition through a delay period (Beran & Hopkins, 2018). All of these results set the stage for one final attempt in my lab to see whether chimpanzees could solve the reverse-reward contingency task.

A return to the reverse-reward task

In that study (Beran, James, et al., 2016a), we integrated aspects of the accumulation task into the reverse-reward task. The goal was to better understand what it was about the reverse-reward task that made it so difficult for chimpanzees. The competing hypotheses were that chimpanzees struggled to point toward smaller amounts when also faced with larger amounts no matter the context because of strong inhibitory demands that they could not meet, or that it was the contingencies of the task response classes that caused problems. These ideas linked back to Boysen’s early explanations and then later work which suggested that this difficulty may not be all about inhibition (see Silberberg & Fujita, 1996).

Our first experiment involved three chimpanzees, and we replicated the basic reverse-reward task while introducing a new variation on the task. In the first phase, we drew on features of the accumulation task. A single food item was presented, as was an empty bowl. As we expected, the chimpanzees all pointed to the food item, and here is where the procedure was new. The item they pointed to was then placed in the bowl. Then another single item was presented, and the bowl option now contained one item. The chimpanzees could choose either. If they picked the single item, it went into the bowl. If they picked the bowl, they got everything in it (at this point in the trial, that would be one food item), but the trial ended. As long as they pointed to the single item, it kept being added to the bowl, until there were 10 items in the bowl and no single item was presented, at which point, of course, they all would point at the bowl (if they had not already done that and ended the trial earlier).
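
The trial structure described above can be written out as a short simulation, which may help make the contingencies concrete. This is only an illustrative sketch under stated assumptions, not the software or procedure code used in the study: the function name `run_accumulation_trial`, the `policy` callback, and the string labels "single" and "bowl" are all hypothetical and were chosen for this example.

```python
from typing import Callable, List, Tuple

MAX_ITEMS = 10  # the trial description above ends once 10 items are in the bowl


def run_accumulation_trial(
    policy: Callable[[int], str], max_items: int = MAX_ITEMS
) -> Tuple[int, List[Tuple[int, int]]]:
    """Simulate one trial of the accumulation variant (hypothetical sketch).

    `policy` receives the current number of items in the bowl and returns
    "single" (point to the single item) or "bowl" (point to the bowl).
    Returns the number of items obtained and the list of
    (single items, bowl items) pairs the subject faced on each choice.
    """
    bowl = 0
    choices_faced: List[Tuple[int, int]] = []
    while bowl < max_items:
        choices_faced.append((1, bowl))  # one single item versus the bowl
        choice = policy(bowl)
        if choice == "bowl":
            # Subject receives the bowl contents and the trial ends.
            return bowl, choices_faced
        if choice != "single":
            raise ValueError(f"unknown choice: {choice!r}")
        bowl += 1  # the chosen single item is moved into the bowl
    # No further single item is presented; the subject collects the full bowl.
    return bowl, choices_faced


# A subject that always points to the single item.
patient_total, patient_choices = run_accumulation_trial(lambda bowl: "single")

# A subject that takes the bowl as soon as it holds more than one item.
impulsive_total, _ = run_accumulation_trial(
    lambda bowl: "bowl" if bowl > 1 else "single"
)
```

In this sketch, a subject that always points to the single item faces ten choices, from one versus zero up to one versus nine, and collects all 10 items, whereas a subject that takes the bowl as soon as it is the larger option ends the trial with only two.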

The key feature of this variation was that by the third response in a trial, the chimpanzees were being shown one item and more than one item in the two locations, just as in the reverse-reward task. To succeed and get the most food, they had to choose one item over zero, then one over one, then one over two, and so forth all the way to choosing one food item over nine items in the bowl. This last choice (nine versus one) is an extreme discrimination in terms of amounts of food, and one that, in the reverse-reward task, would lead to near-universal pointing to the nine items. However, in this test, with single chosen items being added to the array that could later be collected, the chimpanzees were very good at pointing to those single items and adding them to their growing accumulation (17 of 18 total trials). As Silberberg and Fujita (1996) had suggested, chimpanzees can point to smaller food sets, but this works best when the contingencies for such pointing are not about immediate delivery of food to the subject (or a conspecific) but instead serve to increase the delayed reward.

After this success, we returned the chimpanzees to the typical version of the reverse-reward task. Chimpanzees were shown two arrays of food items, and they received whichever set they did not point toward. The outcome was a near-perfect replication of this condition in Boysen and Berntson (1995): nearly total failure by the chimpanzees to point to the smaller amounts! They kept pointing to the larger amount (47 times out of 48 trials), and they kept getting the smaller amount. When we gave them the accumulation version again, they successfully accumulated the items (by often pointing to the smaller set) on all six trials they completed as a group. This behavioral contrast between two very similar tasks was among the most striking I have ever seen. We conducted numerous other experiments in that study, and the results all converged on the idea that chimpanzees could fairly easily point to small amounts if such pointing “collected” food they would get later, but when the pointing led to immediate food delivery, they reverted to almost always pointing to the biggest amount (and never getting it). We concluded that for chimpanzees, at least, losing what one points to is difficult to deal with, whereas keeping what one points to (even if it is delayed in its delivery) allows chimpanzees to focus on the longer-term outcome rather than the present hedonic value of what sits in front of them. From the perspective of a species that shows good self-control, and that can delay gratification, this makes a lot of sense, and it links to many ideas in developmental psychology and choice behavior about “reframing” choice situations and engaging in “cool” rather than “hot” processes when evaluating the outcomes of choices (e.g., Metcalfe & Mischel, 1999). Those “cool” processes are deliberative and decisional, and sometimes take future needs into account (at least in humans). The “hot” processes are reactive, prepotent, and impulsive, and would lead to pointing immediately toward more food even when this is detrimental to actually obtaining more food. That chimpanzees sometimes can point to less food, so as to later obtain more food, suggests that they can engage these “cool” processes.


When I wrote Self-Control in Animals and People (Beran, 2018), I contacted Sally Boysen to see if she would be willing to share her thoughts on the reverse-reward task, and she generously agreed. I published an extensive series of her responses to my questions in that book, but I want to share just a couple here because I think they highlight her approach to comparative cognition and working with animals as much as they tell us about the “Boysen effect” (as we could perhaps call the typical results from using the reverse-reward task).

I asked her what she thought was the most important take-home message from her work with the chimpanzees using this task. She answered:

To tell you the truth, I never expected a body of experiments to come from the original tasks. At least with respect to the candy/numeral phase of the study, I think that the overwhelming power of symbols to allow us to override powerful, biological dictates is the most important take-home message, which emphasizes whenever/however that incredible evolutionary shift occurred that allowed the hominid line to begin to represent its work was the one of the most significant changes in our evolutionary trajectory. (Beran, 2018)

Regarding the early stages of using this task, when she was hoping to just teach the reversed contingencies, Boysen said this about how she thought about the chimpanzees’ performances and struggles:

It was clear from the start of testing with her [the chimpanzee Sarah] that she didn’t seem to “get it,” which we had seen with her on other tasks. I gave it the old college try, however, and ran her through the initial trials, which I think compared something like 2 vs 1 or so. I keep increasing the disparity between the candy arrays in subsequent sessions hoping that the loss of a huge amount of candy when she chose the larger array consistently might shake her up enough to pay attention to what was going on. We even had to watch ourselves so that we didn’t start to brace ourselves and hold on to the apparatus we were using. It had a metal frame and wooden flat surface which we could just push up against the front of Sarah’s home cage and present the two arrays. Sarah was getting so frustrated after each choice that she started to grab and shake the apparatus violently. After we ran the 1 vs 6 trial sessions, I gave up with her.

But at that time, we weren’t sure if it was just Sarah not getting it, or something weird going on with the task. How could the most highly-trained, highly taught chimpanzee in the world at the time NOT understand how her choice was impacting on the rewards she was getting (mostly NOT getting). Moreover, how come she could not LEARN how to get the most for her decision? She would make the same choice, over and over, and pound/shake the metal frame of the apparatus, grimace, and whine, and then 10 seconds later, make the same WRONG choice, trial after trial (I think we did 6–8 trials per session, for 2 sessions, and then changed the disparity (1 vs 2, 1 vs 4, 1 vs 6), and still NOTHING made a difference. When we finished with Sarah, I remembered something that I think is attributed to Skinner, “If you find something interesting, drop everything else and study that.” So, that’s what I did, and we immediately ran Sheba on the task—with exactly the same findings. After Sheba, I now knew that it wasn’t just a quirk of Sarah, but all our chimps were not “getting it.” Since the chimps had been using numbers for some time, I immediately thought we could “subtract” the features of the candy arrays, and see if they could learn the task using numerals. Imagine my surprise when I discovered that they didn’t have to “learn” the task at all—they just needed a cognitive conduit for expressing the optimal choice. (Beran, 2018)

It is important to note that not all efforts to use the reverse-reward task led to high levels of failure. Research with all of the great apes (chimpanzees, bonobos, gorillas, and orangutans) now has included reports of variable levels of success even without need of modification of the original task (e.g., Uher & Call, 2008; Vlamings et al., 2006). So, I think there is still a lot to be done with this task, and it has a lot of value in providing a measure of inhibitory control, decision-making, and self-control in other species, using methods that would be easy to adapt for many types of animals. Historically, it is a fascinating task because it was the assumed training phase for more complicated tests of numerical cognition in chimpanzees, and yet the results opened up new avenues for research into animal minds that steered away from the numerical focus and instead gave us a way to examine how animals deal with complicated contingencies for their actions.