Introduction

Risky choice, defined here as a choice between an option offering probabilistic outcomes and a safe(r) option offering certain outcomes, is studied in humans and nonhuman animals (animals, hereafter). Studying risky decision-making is important to understanding decision-making across a range of disciplines (Mishra, 2014) and to understanding the fundamental cognitive processes that govern decision-making in animals (e.g., Kacelnik & Bateson, 1996; Kacelnik & El Mouden, 2013). Animal models of risky choice have practical implications for understanding human risky decision-making in the real world, and they aid in the development of interventions that can attenuate maladaptive decision-making – for example, problem gambling behavior (Paglieri, Addessi, De Petrillo, Laviola, Mirolli, et al., 2014; Zentall, 2016a).

Environmental context plays a significant role in modulating risky decision-making independently of risk biases inherent in a species (e.g., Heilbronner & Hayden, 2013; Sayers & Menzel, 2017). A well-studied factor that affects risky decision-making is the presence of a reward cue or some environmental stimulus that signals that a risky choice will result in the delivery of a reward after a short delay (Belke & Spetch, 1994; Dunn & Spetch, 1990; Fantino, Dunn, & Meck, 1979; Fortes et al., 2016; Fortes et al., 2017; Gipson et al., 2009; Hinnenkamp et al., 2017; Kendall, 1974; Laude et al., 2014a; McDevitt et al., 1997; McDevitt & Williams, 2001; Pisklak et al., 2015; Spetch & Dunn, 1987; Spetch et al., 1990; Smith & Zentall, 2016; Stagner et al., 2011; Stagner et al., 2012; Stagner & Zentall, 2010; Vasconcelos et al., 2015; Vasconcelos, Machado, & Pandeirada, 2018; Zentall et al., 2015). For example, Zentall and Stagner (2011) had pigeons choose between an option that delivered three pellets (after a 10-s delay) with a 100% probability and a less-optimal option that delivered ten pellets with a 20% probability. Under conditions where the outcome was signaled during the delay interval the pigeons detrimentally favored the risky option, and under conditions where the outcome was unsignaled, the pigeons favored the safer option. 
The influence of an informative delay signal promoting more risky decision-making (i.e., the “signaling effect”) also has been observed in rats (Chow, Smith, Wilson, Zentall, & Beckmann, 2017; Cunningham & Shahan, 2019; Ojeda, Murphy, & Kacelnik, 2018; however, see: Alba, Rodríguez, Martínez, & Orduña, 2018; Martínez, Alba, Rodríguez, & Orduña, 2017; Trujano & Orduña, 2015), human children with developmental disabilities (Lalli, Mauro, & Mace, 2000), and a wide range of other experiments involving pigeons (see discussions in: Cunningham & Shahan, 2018; McDevitt, Dunn, Spetch, & Ludvig, 2016; Vasconcelos, Machado, & Pandeirada, 2018; Zentall, 2016b).

The signaling effect may hold important theoretical significance because it may reflect some general process(es) that produce(s) suboptimal decision-making (Cunningham & Shahan, 2018; McDevitt et al., 2016; Vasconcelos et al., 2018; Zentall, 2016b). Zentall (2014, 2016a) has argued that the process behind the signaling effect may be the basis for a valid animal model of problem gambling behavior. This argument has some merit – Molet et al. (2012) reported that undergraduates who self-reported a high degree of gambling were more likely to make outcome-signal-induced risky choices in a human version of the task provided to pigeons (also see Zentall & Stagner, 2011). Researchers who evaluate problem gambling have suggested that it might be due to dysfunctional reward expectancy (Linnet, 2014; van Holst et al., 2012). For instance, experienced poker players demonstrated a good understanding of the likelihood of winning a gamble and inexperienced players demonstrated a poorer understanding; but problem gamblers, while experienced, also displayed an inaccurate anticipation of winning (Linnet et al., 2012). The Zentall and Stagner (2011) procedure may provide opportunities to study the basic processes underlying problem gambling in nonhuman animals. For example, Laude et al. (2014a, b) exposed pigeons to both a signaled risky choice task and an intertemporal choice task, and reported that pigeons that were more likely to favor the signaled risky option were also more prone to favoring the “impulsive” (smaller-immediate reward over a larger-delayed reward) option. Those individual differences correspond with some human research showing that problem gamblers also appear to be more likely to make an impulsive choice (e.g., Alessi & Petry, 2003; Dixon, Marley, & Jacobs, 2003). However, it is necessary to ensure that the generality of the signaling effect is broad and extends beyond pigeons.

Recently, Smith, Beran, and Young (2017) generalized the signaling effect to rhesus macaques (Macaca mulatta). Adapting the design of Zentall and Stagner (2011), the monkeys worked on a computerized risky choice task where a joystick was used to select a risky option offering eight pellets at a 0.2 probability or a safe option offering two pellets at a 1.0 probability. A 9-s delay separated the choice and the outcome, and in signaled sessions a flashing color on the screen was predictive of the outcome (risky win, risky loss, or safe), while in the unsignaled sessions a flashing color was uncorrelated with the choice outcome. Six out of seven macaques showed an increased preference for the risky option in the signaled sessions. This demonstrated that the signaling effect is broad enough to affect the decision-making of an Old-World primate species. Capuchin monkeys (Cebus [Sapajus] apella), a New World primate species, were given this same experimental procedure in a pilot study to further establish the generality of the signaling effect. However, after about 20 sessions (over a month in training) the capuchin monkeys did not develop a clear preference for the risky option in the way that the rhesus monkeys had (unpublished data, see Supplementary Materials). This species difference was not likely due to different inherent sensitivities to risk because there is a risk-prone bias in probabilistic decision-making in both rhesus monkeys (Xu & Kralik, 2014) and capuchin monkeys (De Petrillo, Ventricelli, Ponsi, & Addessi, 2015). It is possible, however, that the capuchin monkeys were not attending to the stimulus presented in the delay period.

Experiment 1 in the present study was designed to test whether the failure of the signaling effect in capuchin monkeys was due to inattention to the delay signals. Experiment 1 assessed whether rhesus monkeys and capuchin monkeys anticipated the trial outcomes by presenting a modified version of the Smith et al. (2017) procedure. The modification allowed the monkeys the opportunity to shorten the delay period between the choice and outcome periods (the time that was shortened from the delay period was added to the subsequent intertrial interval [ITI], and so shortening the delay did not benefit the monkey by reducing time spent waiting). If the capuchin monkeys were anticipating reward at the end of the delay period, then they should shorten the delay period when a win was signaled and wait through the delay period when a loss was signaled. This prediction is informed by animal research using observing procedures (Wyckoff, 1952) where animals responded for food on one response option (food-response) under conditions where that food-response occasionally did not function for random and unsignaled periods of time (i.e., a mixed schedule of reinforcement). The observing procedure also included an observing-response that produced a temporary signal that was informative about whether the food-response was functional or not (i.e., a multiple schedule of reinforcement). Under these conditions, pigeons working for food (Dinsmoor, Browne, & Lawrence, 1972) and rhesus monkeys working for drug reinforcers (Woods & Winger, 2002) have shown a greater tendency to make an observing-response when that signal was informative about whether the food-response was operative (i.e., it provided “good news”) than when it was informative about whether the food-response was inoperable (i.e., “bad news”). Related to the observing procedure, the outcome signals in the risky choice task also function as good (risky-win) and bad (risky-loss) news. 
Similar to the observing response, for risky choices there is an asymmetry in the influence of outcome signals – while signals provide both good news and bad news, only the good news appears to affect risky decision-making (Fortes, Vasconcelos, & Machado, 2016; Laude, Stagner, & Zentall, 2014b). However, animals are not entirely insensitive to bad news. Fortes et al. (2017) demonstrated that pigeons in a long chamber (informative options on one side and uninformative options on the other) remained by the response options when a signaled-win was presented but moved away from those options when a signaled-loss was presented. Thus, signaled-losses were attended to, but they were not effective in suppressing risky choices in the same manner that signaled-wins encouraged risky choices.

Experiment 1 assessed whether outcome signals affected risky choices of rhesus (replicating Smith, Beran, & Young, 2017) and capuchin monkeys, and whether the signals appeared to affect outcome anticipation. The failure to observe robust sensitivity to the signaling effect in capuchin monkeys in the pilot study might be due to a failure of the monkeys to attend to the signals during the delays, whereas the rhesus monkeys’ risky choices should be affected by the delay signals and they should also truncate the delays for signaled-wins but not signaled-losses. Experiments 2 and 3 were follow-up experiments that focused on the question of whether the outcome signals affect risky decision-making functionally through an outcome expectancy process (consistent with the idea that “gambling” is mediated through outcome expectations) or an alternative process such as conditioned responding to stimuli correlated with reward. Table 1 presents a summary of the three experiments to help outline their rationale and progression.

Table 1 Experimental overview. The Sessions and Trials columns show the scheduled numbers, but monkeys worked at their own pace, and so they ultimately determined the total trials based on how often they finished all trials within sessions

Experiment 1

Experiment 1 compared risky-choice data from capuchin and rhesus monkeys on a task that was identical to the Smith et al. (2017) procedure in all respects except one. In the present task, the monkeys could truncate the delay period by moving a cursor to remove up to three on-screen card stimuli that each represented a 6.67-s delay segment separating the choice and outcome periods. The amount of time truncated during the delay period was added to the subsequent ITI to ensure that delay truncation did not increase the rate of trial exposure and opportunities to obtain pellets. The monkeys should show more risky choices in the signaled sessions (vs. unsignaled sessions) and be more likely to truncate the delay during a signaled-win (vs. signaled-loss) delay period. If the capuchin monkeys show poor sensitivity to the outcome signals due to inattention to those signals’ correlation with the outcome, then we would anticipate that those monkeys will show undifferentiated risky choices between signaled and unsignaled sessions along with undifferentiated delay truncations between signaled-win and signaled-loss outcomes.

Smith et al. (2017) evaluated the effect of prior choice outcomes on the monkeys’ following choices and reported that the monkeys tended to follow their prior choice regardless of the outcome. That is, the monkeys returned to the risky option after experiencing a risky outcome, even if that outcome was a loss (signaled or unsignaled). The present study will determine whether the monkeys continue to follow their prior choices. It is possible that the delay truncation response will increase sensitivity to the prior outcome (because the truncation response might increase attention to the delay signals) and therefore may reduce preference for the risky option following a risky-loss outcome. If this occurs, it is possible that the rhesus monkeys will show a greater sensitivity to the outcome signals in the present study than in the Smith et al. (2017) study. Finally, the risky option is designed to be suboptimal, so preference for the risky option should result in fewer reward pellets earned.

Method

Subjects

Six adult male rhesus macaque monkeys (Macaca mulatta; Han, Hank, Lou, Luke, Obi, and Murph) and three adult tufted capuchin monkeys (Cebus [Sapajus] apella; one male: Griffin, and two females: Star and Gonzo) participated in this experiment. All six of the rhesus monkeys had previously participated in the Smith et al. (2017) experiments. Two of the capuchin monkeys (Griffin and Star) participated in the pilot study. Research time for additional capuchin monkeys was not available for Experiment 1 due to other testing priorities with those monkeys. Monkeys were not food-deprived or weight-reduced and they had access to water ad libitum. Rhesus monkeys started working at 0900 h for 4 to 5 days a week and capuchin monkeys began working at 1030 h for approximately 3 days a week. For Experiment 1, the capuchin monkeys had a different and unrelated cognitive task that preceded the present task. The present task began once that other task was completed, and it often required 60 to 120 min to complete, depending upon the monkey. This study complied with approved Georgia State University Institutional Animal Care and Use Committee (IACUC) protocols and the United States Department of Agriculture Animal Welfare Act, and the “Guidelines for the Use of Laboratory Animals.” Georgia State University is accredited by the Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC).

Apparatus

The apparatus consisted of a personal computer with color monitor (800 × 600 screen resolution), digital joystick, and food pellet dispenser. The joystick controlled a cursor displayed on the monitor, and the computer was programmed to deliver banana-flavored pellets (94 mg for macaques and 45 mg for capuchins; Bioserve, Frenchtown, NJ, USA), as a consequence for correct responses, through a dispenser interfaced to the computer using a relay box and output board (Keithley Instruments, Cleveland, OH, USA). Rhesus monkeys (Richardson, Washburn, Hopkins, Savage-Rumbaugh, & Rumbaugh, 1990) and capuchin monkeys (Evans, Beran, Chan, Klein, & Menzel, 2008) had been trained to engage this computer system and had all participated in a large number of other studies of learning and cognition.

Procedure

The experimental design and the program determining the procedural contingencies were identical for both monkey species. Sessions ran for 120 trials or until 4 h elapsed, whichever occurred first. All monkeys completed 50 sessions. No pretraining procedures were used to aid in task acquisition. All trials included a choice period, a delay period, an outcome period, and an ITI (Fig. 1). Trials started in the choice period (Fig. 1A) with a red dot icon at the bottom-center portion of the screen representing the joystick-controlled cursor and two clipart icons on the top-left and top-right portions of the screen representing the safe and risky options. The left/right locations of the safe and risky clipart icons were randomized across trials. New clipart icons were randomly selected to represent the risky and safe options across sessions (i.e., monkeys had to reacquire the icons’ choice representation each session). Once the monkeys moved the cursor to contact an icon, a choice was registered and the trial proceeded to the (three-segment) delay period. Upon transitioning to the three-segment delay period, three horizontally arranged “flashing card” stimuli (each card representing a 6.67-s delay segment) were presented on the lower part of the screen, and the cursor was presented on the top middle part of the screen (Fig. 1B). Each card stimulus had an arbitrary symbol, and the cards alternated every 0.25 s between a positive and negative image of that symbol (giving the impression that the cards were flashing; Fig. 1C). Progressing through each delay segment involved the cards flashing on the screen until either 6.67 s elapsed or the cursor was moved to contact the largest (left-most) card on the screen (i.e., a truncation response to shorten the delay segment). In the first segment all three flashing cards were visible and the left card was largest (Fig. 1B). 
In the second segment the left card was not present, the middle flashing card was large, and the right flashing card was small (Fig. 1D). In the third segment the left and center cards were not present, and the right flashing card was large (Fig. 1E). The delay period lasted at most 20 s (if the “truncation response” was not used) or at the least 3 s if all three delay segments were truncated to shorten the delay period (it took approximately 1–1.2 s to move the cursor from the top center location to the largest card on the screen). The color and symbol on the cards were varied depending upon the signaling condition for the session (described below).

Fig. 1 An illustration of the time through the trial for Experiment 1. A. Choice period with the joystick-controlled cursor (red dot), the risky option, and the safe option. B. The first segment of the delay period with three cards flashing at the bottom of the screen and the cursor located at the top center of the screen. C. Flashing involved all the cards alternating between a positive and negative version of the color/symbol every 0.25 s. D. The second segment of the delay period. E. The third segment of the delay period. F. The outcome and intertrial interval period involved a blank white screen. Note: The visible cards continued to flash (e.g., C) throughout the delay period, including the second (D) and third (E) delay segments

If the monkey moved the cursor to select the risky option, then following the delay period the monkey would receive eight pellets with a 0.125 probability or zero pellets with a 0.875 probability. If the monkey selected the safe option, then following the delay period the monkey would receive two pellets with a 1.0 probability. Following the outcome period, there was an ITI that lasted for at least 5 s plus any time that was deducted from the delay period (e.g., if the monkey truncated the 20-s delay down to 7.5 s, then the ITI would last 17.5 s [5+12.5]). Thus, total trial duration was kept approximately constant. Each session started with four forced-choice trials (two safe outcome, one risky-win, and one risky-loss outcome randomly determined) and additional forced-choice trials were placed on trials 41–44 and 81–84 (i.e., placed in session thirds). During forced-choice trials the two choice icons were presented, but if the monkey chose the icon for the unforced option the trial reset until the monkey chose the icon for the forced option.
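The payoff and timing rules above can be checked with a short sketch. The function and constant names are illustrative and are not taken from the experimental program; only the numeric values (eight pellets at 0.125, two pellets at 1.0, a 20-s maximum delay, and a 5-s minimum ITI) come from the Procedure.

```python
# Illustrative sketch; names are hypothetical, values are from the Procedure text.
MAX_DELAY_S = 20.0   # three untruncated 6.67-s segments
BASE_ITI_S = 5.0     # minimum intertrial interval


def iti_duration(actual_delay_s: float) -> float:
    """Return the ITI: the 5-s base plus any time cut from the 20-s delay."""
    saved = MAX_DELAY_S - actual_delay_s
    if saved < 0:
        raise ValueError("delay cannot exceed the scheduled maximum")
    return BASE_ITI_S + saved


def expected_pellets(pellets: int, probability: float) -> float:
    """Expected pellets per choice for a single-outcome gamble."""
    return pellets * probability


risky_ev = expected_pellets(8, 0.125)  # expected pellets per risky choice
safe_ev = expected_pellets(2, 1.0)     # expected pellets per safe choice
```

Under these values the risky option yields half the expected pellets of the safe option, and the delay plus ITI sums to 25 s whether or not the delay is truncated.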

Across sessions the monkeys were placed into signaled sessions or unsignaled sessions, randomly with the restriction that a signaled/unsignaled session could not occur for three or more consecutive sessions. For the signaled sessions, the card colors and symbols were predictive of the outcome. Following the selection of the risky option, cards included a red smiley face predictive of an eight-pellet risky-win or a yellow skull and crossbones predictive of a zero-pellet risky-loss. Following the selection of the safe option, cards included a blue heart (♥) that was presented at a 0.125 probability and predictive of a two-pellet outcome and a green star presented at a 0.875 probability and predictive of a two-pellet outcome. The purpose of the separate card stimuli and probabilities of presentation used for the two-pellet safe outcome was to equate the variance in card stimuli between the safe and risky options. It is worth noting that the signaled/unsignaled delays should only really affect the risky option because the safe option always resulted in two pellets and therefore the monkeys should always be capable of discriminating the outcome for choosing the safe option. For the unsignaled sessions, the card colors and symbols were different following the selection of a safe or risky option but were uninformative about the outcome of the trial. Following the selection of the risky option, a gray question mark (?) was always presented on the card stimuli (regardless of the subsequent outcome). Following the selection of the safe option, a black exclamation point (!) was always presented on the card stimuli.
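The session-ordering restriction and the card assignments can be sketched as follows. The randomization method actually used is not described in the text, so the approach here (forcing a switch after two identical sessions in a row) is only one assumed way to satisfy the restriction, and all function names are hypothetical.

```python
import random


def session_schedule(n_sessions, seed=None):
    """Randomly order session types with no run of three or more identical types.

    Assumption: a switch is forced after two identical sessions in a row.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_sessions):
        if len(schedule) >= 2 and schedule[-1] == schedule[-2]:
            nxt = "unsignaled" if schedule[-1] == "signaled" else "signaled"
        else:
            nxt = rng.choice(["signaled", "unsignaled"])
        schedule.append(nxt)
    return schedule


def longest_run(schedule):
    """Length of the longest run of identical consecutive session types."""
    longest = current = 0
    previous = None
    for session in schedule:
        current = current + 1 if session == previous else 1
        longest = max(longest, current)
        previous = session
    return longest


def delay_card(session_type, choice, risky_win, rng):
    """Return the card shown during the delay, following the Procedure text."""
    if session_type == "unsignaled":
        return "gray question mark" if choice == "risky" else "black exclamation point"
    if choice == "risky":
        return "red smiley face" if risky_win else "yellow skull and crossbones"
    # Safe cards mirror the risky probabilities but always precede two pellets.
    return "blue heart" if rng.random() < 0.125 else "green star"
```

For example, `session_schedule(50, seed=1)` returns 50 session labels in which neither type appears more than twice in a row.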

Data analysis

The final 40 sessions were included in the analysis for Experiment 1 because we expected that the first ten sessions likely involved the monkeys still learning about the experimental task and its contingencies. Overall, 66% of the sessions were fully completed. Out of 4,800 possible trials the monkeys completed the following number of trials: Obi (3,477), Han (3,895), Hank (4,165), Luke (4,146), Murph (3,605), Star (3,513), Griffin (2,629), and Gonzo (3,570). Incomplete trials were likely a result of this task being paired with an alternative task in a daily session (see Subjects section) and rewards from the alternative sessions reducing motivation in this experiment.

For Experiments 1–3, the data were analyzed using generalized linear mixed-effects models using the lme4 package in R (Bates, Maechler, Bolker, & Walker, 2016). To meet the repeated-measures assumption and to characterize individual subject data, all factors were included as random effects at the individual subject level (Gelman & Hill, 2006). The random effects structure allowed the slopes and intercepts (but not interactions) to vary independently for each individual subject. All categorical factors were effect-coded and continuous factors were centered to address issues of non-essential multicollinearity. Likelihood ratios were computed between the complete model and a reduced (or null) model that excluded a factor (Jarosz & Wiley, 2014; Wagenmakers, 2007). Likelihood ratios are reported to indicate which model performed better. In circumstances where a model did not clearly perform better, the simpler model was favored. Post hoc tests (using the emmeans package in R; Lenth, 2018) were reported to highlight the specific differences relevant to the hypotheses, using the Wald test to produce p-values (α = 0.05). Because many of the models include multiple factors and their interactions, only statistical results that are of theoretical relevance or that explain why the best model was accepted will be reported, to keep the report unburdened with excessive detail. The Supplementary Materials include thorough detail of the fixed effects from the models and the R scripts associated with each model.
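The preprocessing and model-comparison steps can be illustrated with a minimal sketch. It assumes that the likelihood ratio is the BIC-based approximation described by Wagenmakers (2007), that is, exp(ΔBIC / 2); the text does not state the exact formula, and the authors' own R scripts are in the Supplementary Materials. The helper names are hypothetical.

```python
import math


def effect_code(values, reference):
    """Effect-code a two-level factor: +1 for the reference level, -1 otherwise."""
    return [1 if v == reference else -1 for v in values]


def center(values):
    """Center a continuous factor on its mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def bic_likelihood_ratio(bic_best, bic_comparison):
    """Approximate how many times more likely the best model is than the comparison.

    Assumption: the BIC approximation exp((BIC_comparison - BIC_best) / 2).
    """
    return math.exp((bic_comparison - bic_best) / 2.0)


signal = effect_code(["signaled", "unsignaled", "signaled"], reference="signaled")
trial_centered = center(list(range(1, 121)))  # trials 1-120 centered on 60.5
```

Under this approximation, a ratio above 1 favors the first model, and a BIC difference of about 5 corresponds to a ratio of about 12.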

Risky choice was modeled (Risk1) as a function of the signal factor (categorical: signaled vs. unsignaled), species factor (categorical: capuchin vs. macaque), and trial factor (continuous: trials 1–120). Risky choice was binary outcome data (1 = risky option, 0 = safe option) and the Risk1 model specified a binomial distribution. The species factor, a between-subjects factor, was not allowed to vary across subjects in the random effects. Risky choice was further evaluated using an outcome history model (History1) that determined how the prior outcome affected subsequent choices. History1 modeled risky choice as a function of the signaling factor, species factor, and prior-outcome factor (risky-win, risky-loss, and safe outcome). Only data from the latter part of the session (trials ≥ 80) were used in the History1 model.
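The prior-outcome factor used in History1 can be derived from a session's trial sequence as in the sketch below. The record layout (dictionaries with `trial`, `choice`, and `pellets` keys) is an assumption made for illustration, and the text does not say how forced-choice trials were treated, so they are not handled specially here.

```python
def history_rows(trials, min_trial=80):
    """Pair each trial with the outcome of the preceding trial.

    trials: list of dicts ordered by trial number, with hypothetical keys
    'trial' (int), 'choice' ('risky' or 'safe'), and 'pellets' (int).
    Only trials numbered min_trial or later are kept, as in the History1 model.
    """
    rows = []
    for prev, cur in zip(trials, trials[1:]):
        if cur["trial"] < min_trial:
            continue
        if prev["choice"] == "safe":
            prior = "safe"
        elif prev["pellets"] > 0:
            prior = "risky-win"
        else:
            prior = "risky-loss"
        rows.append({
            "trial": cur["trial"],
            "prior_outcome": prior,
            "risky": 1 if cur["choice"] == "risky" else 0,  # binary outcome
        })
    return rows


example = [
    {"trial": 79, "choice": "risky", "pellets": 0},
    {"trial": 80, "choice": "risky", "pellets": 8},
    {"trial": 81, "choice": "safe", "pellets": 2},
    {"trial": 82, "choice": "risky", "pellets": 0},
]
rows = history_rows(example)
```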

To confirm that the risky option was the suboptimal option, a model determined the conditions where the monkeys earned the most pellets. The average number of pellets earned per session was estimated by modeling the number of pellets accumulated per session (because pellet number is count data, the model specified a Poisson distribution) as a function of the signaling factor and the species factor (Food1). For each subject and signaling condition, the risky choice estimates (from the Risk1 model) were correlated (Pearson’s r) with the average number of pellets earned in a session (from the Food1 model).
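The correlation step can be sketched in plain Python. The input values below are made up purely to exercise the function and are not data from the study; the degrees of freedom for Pearson's r are the number of paired estimates minus two.

```python
import math


def pearson_r(x, y):
    """Return Pearson's r and its degrees of freedom (n - 2) for paired values."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy), n - 2


# Hypothetical values: proportion of risky choices and pellets per session.
risky_proportion = [0.1, 0.2, 0.3, 0.4]
pellets_per_session = [4.0, 3.0, 2.0, 1.0]
r, df = pearson_r(risky_proportion, pellets_per_session)
```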

The probability of truncating the delay for each delay segment was analyzed to determine whether the monkeys were sensitive to the outcome signals in the delay period. A truncated delay was defined as any delay segment where the monkey contacted the card to shorten the delay segment (segments ≤ 6.6 s = 1, truncated; segments > 6.6 s = 0, untruncated). The Truncation1 model predicted the probability of truncating the delay as a function of signal factor, species factor, delay segment factor (segments 2 and 3), outcome factor (win, loss, and safe), and their interactions. Due to a programming error the first delay-segment could not be reported because time to truncate the first delay and time to make a choice could not be distinguished.
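The truncation coding rule can be written out as a short sketch. The function name and the input format (a list of the three segment durations in seconds) are assumptions; the 6.6-s cutoff and the exclusion of the first segment follow the text.

```python
TRUNCATION_CUTOFF_S = 6.6  # segments at or below this duration count as truncated


def code_truncation(segment_durations_s):
    """Code segments 2 and 3 as truncated (1) or untruncated (0).

    segment_durations_s: durations of segments 1-3 in seconds (hypothetical format).
    Segment 1 is skipped because it could not be analyzed.
    """
    return {
        segment: int(duration <= TRUNCATION_CUTOFF_S)
        for segment, duration in enumerate(segment_durations_s, start=1)
        if segment in (2, 3)
    }


codes = code_truncation([1.1, 6.67, 1.2])
```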

Results and discussion

Risk1

The best fitting model was the full model, which was over 1,000 times more likely than the model excluding the signal factor and trial block factor (Table 2). The full model was 12 times more likely than the model excluding the species factor. Figure 2 shows the probability of selecting the risky option across trial blocks, signaling conditions, and species. There were no differences in risky choices between the signaled and unsignaled sessions for the first session block (z = -0.89, p = 0.38). This was anticipated because the clipart icons that represented the risky and safe options were new at the start of each session and the monkeys would not initially know which options were associated with the risky/safe outcomes. Preference for the risky option increased across trials for signaled sessions, but not for unsignaled sessions (trial × signal two-way interaction; z = 9.77, p < 0.001). That relationship was stronger for rhesus monkeys compared to capuchin monkeys (z = -3.14, p = 0.002). The post hoc planned comparisons showed that the differences in risky choices between signaled and unsignaled sessions increased across trials for rhesus monkeys (z = 10.85, p < 0.001) and capuchin monkeys (z = 4.11, p < 0.001). For the final 40 trials, rhesus (50% vs. 19%) and capuchin monkeys (61% vs. 55%) were more likely to choose the risky option in the signaled sessions.

Table 2 The likelihood ratios and results of a likelihood ratio test for all models in Experiment 1. The Best Model is the model that is compared against all Comparison models that differ by removing a factor. The Likelihood reports how much more likely the best model is to the comparison. The chi-square statistic results report whether the best model performed statistically better than the comparison
Fig. 2 Proportion of risky choice as a function of species (capuchin vs. rhesus) and signaling condition (signaled or unsignaled sessions). Bar graphs represent the group average ± SEM (fixed effects). Overlaid line plots (random effects) show individual subjects. Individual subject plots are jittered to aid in visualizing potentially stacked plots

History1

The full model did not perform statistically better than a reduced model excluding the species factor. The reduced model performed better than the model excluding the signal factor and prior-outcome (risky-win, risky-loss, safe) factor (Table 2). In the unsignaled sessions, preference for the risky option was greater following a risky-win compared to a safe outcome (45% vs. 23%, z = 3.91, p < 0.001) and following a risky-loss compared to a safe outcome (50% vs. 23%, z = 2.90, p = 0.01). Preference for the risky option was not greater following a risky-win versus a risky-loss outcome (45% vs. 50%, z = -0.66, p = 0.79). In the signaled sessions, preference for the risky option was greater following a risky-win compared to a safe outcome (67% vs. 46%, z = 4.09, p < 0.001). Preference for the risky option was not statistically greater following a risky-win versus a risky-loss outcome (67% vs. 67%, z = 0.07, p = 0.99) or following a risky-loss compared to a safe outcome (67% vs. 46%, z = 2.14, p = 0.08). These results were consistent with what was reported in Smith et al. (2017), where preference tended to track the prior choice regardless of the risky outcome.

Food1-Risk1 correlation

There was a negative correlation between the average number of pellets earned in a session for a monkey and the average proportion of risky choices for a monkey (r(14) = -0.79, p < 0.001). This confirmed that the monkeys’ preference for the risky option in the signaled sessions was not driven by a pellet-optimization strategy.

Truncation1

The full model was the best model (Table 2); this indicates that signaling, species, delay segment, and outcome factors were all important in determining truncation responses. Truncation responses were slightly more likely in the third delay segment than the second delay segment (z = -2.52, p = 0.01) and there was a three-way interaction where the monkeys were less likely to truncate the risky-loss delay in the signaled sessions, and this effect was greater for capuchin monkeys (z = 4.72, p < 0.001). Figure 3 shows the probability of making a truncation response for the third delay segment as a function of species, signal condition, and outcome. Post hoc tests showed that for both monkey species in the unsignaled sessions there were no differences in truncation between risky-win and safe outcomes (capuchin, 96% vs. 93%, z = 1.27, p = 0.40; macaques, 99% vs. 98%, z = 1.36, p = 0.35), risky-win and risky-loss outcomes (capuchin, 96% vs. 95%, z = 0.43, p = 0.89; macaques, 99% vs. 99%, z = 0.45, p = 0.89), and risky-loss and safe outcomes (capuchin, 95% vs. 93%, z = -0.82, p = 0.68; macaques, 99% vs. 99%, z = -1.48, p = 0.30). In signaled sessions there were no differences in truncation responses between risky-win and safe outcomes (capuchin, 91% vs. 94%, z = -0.37, p = 0.92; macaques, 98% vs. 95%, z = 1.73, p = 0.19). However, there were fewer truncation responses for signaled-losses than signaled-wins (capuchin, 24% vs. 91%, z = 4.91, p < 0.001; macaques, 63% vs. 98%, z = 5.20, p < 0.001) and signaled risky-loss and safe outcomes (capuchin, 24% vs. 94%, z = 8.11, p < 0.001; macaques, 63% vs. 95%, z = 6.61, p < 0.001). Overall, the monkeys were biased towards truncating the delays in all unsignaled trials (~ >50%) and only avoided delay truncation under conditions where a risky loss was signaled. The rhesus monkey Hank was the only subject that truncated the delay period when a signaled risky loss was presented.

Fig. 3 Proportion of delay periods where the monkeys used the cursor to truncate the third delay segment as a function of species (capuchin vs. rhesus), signaling condition (signaled vs. unsignaled), and trial outcome (risky-win, risky-loss, and safe). The other plot characteristics (subject legend, plot fill, error bars, plot jittering) are identical to Fig. 2

The choice data from the rhesus monkeys generally replicated what was reported in Smith et al. (2017), including the absence of an outcome-signal effect with Hank, which was also true in the previous report. Interestingly, Hank made the effort to truncate the delays during the risky-loss signal; this suggests that the insensitivity of his risky choices to the signaling stimulus might be due to inattentiveness to the information conveyed by the delay signal. Besides Hank, all monkeys avoided truncating the delay when a signaled-loss outcome was presented. This indicates that the monkeys were generally attending to the information conveyed by those stimuli. The Risk1 model indicated that both capuchin and rhesus monkeys made more risky choices in the signaled sessions than in the unsignaled sessions. These results contrast with the pilot data showing no signaling effect on risky choices in capuchin monkeys. It is not clear why there was such an improvement. Perhaps the availability of the delay-truncation contingency increased attention to those stimuli and that affected risky choices. However, capuchin monkeys’ risky choices were less sensitive to the outcome signals compared to rhesus monkeys. Also, only three capuchin monkeys were included in this experiment due to limited access to more animals at the time of the experiment. For this reason, we extended our investigation with this species and included new monkeys.

Experiment 2

Experiment 1 demonstrated that the monkeys would differentially truncate the delay prior to reward delivery and that these truncation responses might reflect the monkeys’ outcome expectancy. The use of the truncation response due to anticipation of the reward implies that the behavior is governed by conscious/explicit judgments about the forthcoming outcomes. The investigation into conscious-like processes in nonhuman animals generally comes from research in metacognition. Prior research has demonstrated that nonhuman primates will avoid unnecessary effort if they do not expect a reward (Beran et al., 2015) or if they expect reward without effort (Beran et al., 2013; Call & Carpenter, 2001; Hampton et al., 2004). It is possible that the use of the truncation response reflected an outcome expectancy, but reward stimuli can serve different functions. Alternatively, truncation responses could simply reflect a conditioned attraction to stimuli that signal reward (e.g., sign-tracking; Tomie, Brooks, & Zito, 1989). Experiment 2 was primarily designed to determine whether the monkeys’ use of the delay-truncation response was more consistent with metacognitive-like behavior or with such a conditioned response.

Nonhuman primate research focusing on possible instances of metacognition has shown that monkeys and apes can communicate their expectations using behavioral tasks (Smith, Beran, Couchman, Coutinho, & Boomer, 2009; Smith, Couchman, & Beran, 2012, 2014). A variety of methods assess metacognition in primates, but the truncation response is most similar to the “confidence response” method. Beran et al. (2015) gave chimpanzees a computerized task and placed the reward-dispensing station a distance away from the computerized-task station. This arrangement required the chimpanzees to promptly move to collect the reward after making a correct task response, otherwise the reward would be forfeited. They found that, even before any feedback came from the computerized test, the chimpanzees were more likely to move from the workstation to collect the reward when they made a correct response than an incorrect response. This suggested that, at some level, chimpanzees had a metacognitive awareness of whether they had answered the trial correctly or not, which was manifested through these confidence movements. Recently, we have demonstrated that capuchin monkeys would also show similar “confidence movements” in a metacognitive task where they were more likely to travel a distance to collect a reward following a correct response than an incorrect response (Smith et al., under review). Thus, capuchin monkeys do display metacognitive-like abilities when anticipating the receipt of reward.

In Experiment 1, the monkeys’ shortening of the delay period may have reflected their “expectancy” of reward. In the unsignaled conditions, where the monkeys waited through the delay period in ignorance of the outcome, the use of the delay truncation response might reflect an awareness of the outcome if they truncate only certain trials. To test this possibility, Experiments 2 and 3 replaced the safe option with a simple circle-size judgment task, but following the task there was still a delay period and the reward outcome of the task was either signaled or unsignaled (as when the risky choice was selected). Task difficulty was varied across conditions to determine whether the monkeys were more likely to truncate the delay in unsignaled conditions when the task was easy and the likelihood of earning food was high. If the monkeys’ use of the delay truncation response reflected a metacognitive process, then when the outcome was unsignaled the monkeys might have been more likely to shorten the delay period when they correctly anticipated a reward. Under conditions where the outcomes of a choice were signaled, the monkeys’ reward anticipation should be driven by the information provided by the signal; however, under conditions where the outcomes were unsignaled, the monkeys’ expectations could be mediated by an endogenous (or private) stimulus (Hampton, 2009).

The inclusion of easy or difficult task options as an alternative to a risky choice also allows for the ability to assess the effects of work difficulty on risky decision-making. The monkeys should earn fewer pellets under the difficult task conditions and be more likely to favor the risky option under those conditions. It is not known if task difficulty will interact with signaling conditions to affect risky choices.

In Experiment 1 the outcome signals produced only a weak tendency to favor the risky options in capuchin monkeys, and this might be due to the monkeys being content to choose carelessly because both options occasionally produce food. Beran et al. (2016) demonstrated that capuchin monkeys would not display uncertainty responses (i.e., escaping a challenging task to avoid a time-out, but forfeiting the opportunity to earn food) on a two-option discrimination task (50% chance of correctly guessing a response), but would occasionally use the uncertainty response on a six-option discrimination task (16.7% chance of a correct guess). Extrapolating that outcome to the present study, the two-choice task may have resulted in inattentive responding in capuchins or apathy about the outcomes that were not technically “wrong” (both options had a decent pellet return). Thus, in Experiment 2, six foil options (choice icons that only reset the trial) were randomly intermixed with the two choice options to encourage the monkeys to attend to their options. If the capuchins were inattentive to their options in Experiment 1, then this procedural adjustment should increase the difference in risky choice between the signaled and unsignaled conditions.

Method

Subjects

Seven capuchin monkeys (five males: Benny, Griffin, Logan, Mason, and Nkima; and two females: Bias and Gambit) participated in Experiment 2. Unlike Experiment 1, the capuchins had a full 4 h to complete a daily session because of differences in lab-based schedules for animal testing with each species. Gonzo and Star were not included in Experiment 2 because they became inconsistent in entering the testing box during Experiment 2. Choosing to work is voluntary for these monkeys, and they sometimes opt not to work on certain tasks. Research time was unavailable for rhesus monkeys to permit their inclusion in Experiments 2 and 3.

Apparatus

The same apparatus described in Experiment 1 was used for Experiment 2.

Design and procedure

Generally, the procedure used for Experiment 2 matched the procedure described for Experiment 1 (Fig. 1), but with the following six modifications:

  1. (1)

    Sessions lasted for 600 trials, or until 4 h elapsed, and were divided into four 150-trial phases. The four 150-trial phases presented, in random order, four conditions that varied the signaling and task-difficulty factors in a 2 × 2 factorial design (signaled easy-task, signaled hard-task, unsignaled easy-task, and unsignaled hard-task). The monkeys continued in the experiment until at least 2,000 trials of each trial-phase were completed (i.e., 8,000 total trials completed) and 93% of the sessions were fully completed. This modification was included to test multiple experimental conditions within a single session and because the monkeys had more session time to complete these 600 trials compared to Experiment 1.

  2. (2)

    The choice screen presented eight options, six foil icons randomly mixed with two icons representing the two choice options (Fig. 4A and B). The cursor was presented centrally on the screen and the eight icons were arranged around the center with two above, two below, two to the left, and two to the right. The foil icons were all boxes with a black alphabetical symbol on a white background and the choice icons were random colored symbols with a colored background. Thus, the choice options stood out from the array, but the monkeys still needed to attend to the whole array to make a specific choice. Unlike Experiment 1, the choice icons representing the options were the same across sessions but differed depending upon whether the monkeys were experiencing an unsignaled phase (e.g., Fig. 4A) or a signaled phase (e.g., Fig. 4B).

  3. (3)

    The alternative to the risky option no longer delivered two pellets with a 1.0 probability, but rather led to a circle-size judgment task. The task period arranged four circles horizontally near the top portion of the screen and a cursor at the bottom center portion of the screen (e.g., Fig. 4C). The diameters of the circles varied across trials (1.9–7.6 cm) and one of the circles had a large diameter relative to the other three circles, which had identically sized diameters. If the largest (correct) circle was selected, then the trial proceeded to the delay period with flashing cards and then to the outcome period where two pellets were delivered accompanied by a chime sound. If a small (incorrect) circle was selected, then the trial proceeded to the delay period and ended without pellet delivery, accompanied by a buzzing sound. In the signaled phases, the flashing cards in the delay period differentially signaled whether the circle selection in the task period was correct or incorrect. In unsignaled phases the flashing cards in the delay period provided no information about whether the task selection was correct or incorrect.

    Fig. 4

    Illustration of choice and task screens used during Experiment 2 and Experiment 3. Panels A and B show the choice screens for the unsignaled (Panel A) and signaled (Panel B) session components. The central red dot is the cursor, the alphabetical card stimuli are foil choices, and the two choice card stimuli are risky and task/safe choices (this replaced the choice period in Fig. 1A for Experiments 2 and 3). Panel C shows the circle-judgment task screen where the red dot is the cursor used to move and select the largest circle on the screen to make a correct response in Experiments 2 and 3. Panel D shows the yoked-task choice that was used during yoked session components in Experiment 3; the cursor selected the central square and pellets were delivered at a probability yoked to the probability of earning a pellet in the immediately prior judgment task component

    The difficulty of the task was varied across task phases within a session. In easy-task phases the difference in diameter between the target circle and the foil circles varied between 2.2, 2.5, and 2.9 cm, whereas in hard-task phases the difference in target and foil circle diameters varied between 0 (impossible to distinguish), 0.3, and 0.6 cm.

  4. (4)

    The card stimuli were changed so that monkeys were randomly assigned to having one of two sets of card stimuli to represent the choice options, the signaled-phase card stimuli, and the unsignaled-phase card stimuli. Figure 5 shows the two sets of card stimuli used for the choice period and delay period. For the unsignaled phases the choice option stimuli and the flashing delay segment stimuli were identical (e.g., the gray “?” card represented the risky option and was also the flashing card presented during the delay period). For the signaled phases the delay segment card stimuli differed based upon rewarded task trials (i.e., risky-wins or correct circle selections) and unrewarded task trials (i.e., risky-losses or incorrect circle selections). For both sets in the signaled phases, the symbol for the choice card stimuli in the task and risky options were a composite of the delay card stimuli (e.g., a “–” risky-win delay card symbol and a “|” risky-loss delay card symbol combined to make a “+” risky choice option symbol).

    Fig. 5

    Illustration of the sets of card stimuli used in Experiment 2 and Experiment 3. The rows show the cards used for risky or safe/task options for Set 1 and Set 2. The columns show the cards presented during the choice periods and delay periods for each set. The first column shows the choice cards used in the unsignaled session components representing both the risky and safe/task options in the choice phase and the flashing card stimuli used in the delay period (choice stimuli and delay stimuli were the same during unsignaled session components). The second row shows the card stimuli in the choice phase for the signaled session components. The third and fourth rows show the card stimuli used during the delay periods of the signaled session components. Row three shows stimuli used during signaled rewarded trials (e.g., risky-wins or correct task trials) and row four shows stimuli used during signaled unrewarded trials (e.g., risky-losses and incorrect task trials).

  5. (5)

    The delay segments were shortened from 6.67 s to 3.33 s per card. Thus, the delay period could be as short as 3 s (if the monkeys promptly moved the cursor to truncate each delay segment) or as long as 10 s (if the monkeys waited through all three delay segments). As in Experiment 1, the time saved from the delay truncation was added to the next 5-s ITI. The delay period was shortened to allow the monkeys to complete all 600 trials within a daily session.

  6. (6)

    Forced-choice trials were occasionally assigned, depending upon the monkeys’ preferences, to prevent the monkeys from completely avoiding an undesired option. If a monkey made five consecutive choices to any one option, then the subsequent trial would force them to choose the neglected alternative. During forced-choice trials, the neglected icon was presented along with the six foils, but the location for the unavailable option was kept blank.
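
The forced-choice rule in modification (6) can be sketched as follows. This is a minimal illustration rather than the software used in the experiment: the function name `forced_option`, the option labels, and the assumption that the run of consecutive choices is counted over the free-choice history passed in are all hypothetical.

```python
from typing import List, Optional, Tuple

RUN_LENGTH = 5  # consecutive same-option choices that trigger a forced trial (from the text)


def forced_option(history: List[str],
                  options: Tuple[str, str] = ("risky", "task")) -> Optional[str]:
    """Return the option to force on the next trial, or None for a free choice.

    `history` is the sequence of choices made so far; how forced trials
    themselves count toward a run is not specified in the text, so the
    caller decides what to include (assumption).
    """
    if len(history) < RUN_LENGTH:
        return None
    recent = history[-RUN_LENGTH:]
    if len(set(recent)) != 1:
        return None  # the last five choices were not all the same option
    chosen = recent[0]
    # The neglected alternative is whichever option was not in the run.
    return options[1] if chosen == options[0] else options[0]
```

For example, five consecutive risky choices would make the next trial a forced task trial, whereas any mixture within the last five choices leaves the next trial free.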

Data analysis and hypotheses

Experiment 2 analyses included 13 sessions (8,000 possible trials) and 93% of the sessions were fully completed. Out of 8,000 possible trials the monkeys completed the following number of trials: Benny (8,000), Bias (6,844), Gambit (6,773), Griffin (6,268), Logan (7,376), Mason (7,149), and Nkima (8,000). Because subjects did not share a daily research session with another task (as was true in Experiment 1), the number of completed trials/sessions is much higher. The lack of 100% completion is due (to our knowledge) to random factors that interrupted the capuchins’ attention to the task.

Experiment 2 utilized the modeling techniques that were reported in Experiment 1. The four interleaved conditions within a session required the monkeys to have a few trials to adapt to the condition changes (note that monkeys required several trials to adapt to a new session in Experiment 1). Thus, only the last 100 trials within a 150-trial session phase were analyzed to ensure that the monkeys’ responses were sufficiently representative of the current phase’s contingencies. For the choice model (Risk2), risky choice was analyzed as a function of the signaling factor (signaled vs. unsignaled trials) and task-difficulty factor (easy- vs. hard-task trials) to determine whether risky choices were more likely in the signaled trials compared to the unsignaled trials, and whether risky choices were more likely in the hard-task phase compared to the easy-task phase. As in Experiment 1, the average number of pellets earned in a session was analyzed as a function of the signaling and task-difficulty factors (Food2). The average number of pellets earned for each subject was correlated with the average probability of making a risky choice for each subject in each signal phase and task phase to determine whether preferences for the risky option came at the cost of a reduction in the number of pellets earned. Risky choice was also evaluated in terms of sensitivity to the prior trial outcome (History2). The History2 model analyzed risky choice as a function of the signaling factor, task-difficulty factor, prior-choice factor (task vs. risk), and prior-outcome factor (risky-win/correct-judgment vs. risky-loss/incorrect-judgment).
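
The trial-selection rule described above (analyzing only the last 100 trials of each 150-trial phase) can be illustrated with a short sketch. The function name and the handling of incomplete phases are assumptions for illustration; the text does not state how phases with fewer than 150 completed trials were treated.

```python
PHASE_LENGTH = 150  # trials per session phase (from the text)
ANALYZED = 100      # trials retained for analysis per phase (from the text)


def analyzed_trials(phase_trials):
    """Drop the first 50 adaptation trials of a phase and keep the rest.

    `phase_trials` is the ordered list of trials from one phase. For a
    complete phase this keeps exactly the last 100 trials. For an
    incomplete phase it keeps whatever follows the first 50 trials,
    which is an assumption, not a rule stated in the text.
    """
    skip = PHASE_LENGTH - ANALYZED
    return phase_trials[skip:]
```

Dropping by position within the phase, rather than slicing the final 100 entries, avoids treating early adaptation trials as representative when a phase ended before all 150 trials were completed.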

The probability of making a correct task-response (binomially distributed data; 1 = correct, 0 = incorrect) was modeled (Accuracy2) as a function of the task-difficulty factor and the signaling factor. The probability of making a correct response should be lower in the hard-task compared to the easy-task condition, and the probability of a correct response should be at chance levels for the subset of impossible-trials in the hard-task condition.

As in Experiment 1, the delay-truncation model (Truncation2) predicted the probability of truncating a delay as a function of the signaling factor, task-difficulty factor, delay-segment factor (first, second, and third), trial-outcome factor, and choice factor (risky or task). In Experiment 2 the trials were coded as truncated (1) if the delay was less than 3.3 s and coded as non-truncated (0) if the delay was 3.3 s or longer. If the monkeys were sensitive to the outcome information conveyed by the signaled card stimuli (as in Experiment 1), then the monkeys should be more likely to truncate the rewarded trials (risky-win or correct-task trials) than the unrewarded trials (risky-loss or incorrect-task trials). If the monkeys demonstrated an awareness of their task performance, then in the unsignaled session phases they should truncate the delay prior to a task-correct outcome, but not the task-incorrect outcome. Furthermore, the monkeys should not be able to differentially truncate unsignaled delays between correct and incorrect trials in the subset of “impossible” task trials in the hard-task condition where the target circle was impossible to identify from the foils.
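
The binary coding of truncation responses can be sketched as below. The 3.3-s criterion comes from the text; the function names and the record layout (a list of condition label and duration pairs) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

CRITERION_S = 3.3  # delays shorter than this are coded as truncated (from the text)


def code_truncation(delay_s: float) -> int:
    """Return 1 if the delay was truncated (< 3.3 s), else 0."""
    return 1 if delay_s < CRITERION_S else 0


def truncation_proportions(records):
    """Proportion of truncated delays per condition label.

    `records` is an iterable of (condition, delay_s) pairs; the labels
    are arbitrary strings supplied by the caller (assumption).
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [truncated, total]
    for condition, delay_s in records:
        counts[condition][0] += code_truncation(delay_s)
        counts[condition][1] += 1
    return {c: t / n for c, (t, n) in counts.items()}
```

The coded 0/1 values are the kind of binomially distributed outcome that the Truncation2 model predicts; the proportions are only a descriptive summary of that coding.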

Results and discussion

Risk2

The full model accounted for the data better than the models without the signal and condition factors (Table 3). The monkeys were more likely to favor the risky option under signaled trials than under unsignaled trials (z = 5.36, p < 0.001) and under the hard-task trials than under easy-task trials (z = -2.76, p = 0.005). The monkeys favored the risky option on signaled-easy trials more than on signaled-hard trials (signal × task-difficulty interaction; z = 5.94, p < 0.001). Figure 6 shows that the monkeys clearly favored the risky option when the outcomes were signaled (vs. unsignaled) in both the easy- (77% vs. 49%; z = 6.14, p < 0.001) and the hard- (78% vs. 60%; z = 4.47, p < 0.001) task difficulty conditions. There were more risky choices in the hard-task trials when the outcomes were unsignaled (60% vs. 49%; z = -3.53, p = 0.002), but there were no differences in risky choices between easy- and hard-task trials when the outcomes were signaled (77% vs. 78%; z = -1.95, p = 0.20). Overall, outcome signaling clearly increased risky choices in capuchin monkeys and risky choices were more likely in hard-task conditions, but only when the outcomes were unsignaled. Outcome signals made the monkeys insensitive to task difficulty. These effects were found in all monkeys except for Gambit, who nearly exclusively favored the risky option across all conditions.

Table 3 The likelihood ratios and results of a likelihood ratio test for all models in Experiment 2. The Best Model is the full model that is compared against all comparison models (Factor) that differ by removing a factor. The Likelihood reports how much more likely the best model is than the comparison model. The chi-square statistic results report whether the best model performed statistically better than the comparison

Fig. 6

Proportion of risky choice selections as a function of task-difficulty factor (easy vs. hard) and signaling factor (signaled or unsignaled sessions) in Experiment 2. The other plot characteristics (plot fill, error bars, plot jittering) are identical to Fig. 2

History2

The full model performed better than the models excluding the signal, condition, and prior outcome factors (Table 3). There was an outcome × choice interaction (z = -12.69, p < 0.001) where monkeys were more likely to choose the risky option following a risky-win (vs. risky-loss) and following an incorrect-task outcome (vs. correct-task outcome). There was an outcome × signal × choice interaction (z = -3.42, p < 0.001) where increased risky choices following a risky-win or an incorrect-task outcome were greater in the signaled sessions compared to the unsignaled sessions. Predominantly, in Experiment 2, earning pellets in the risky or task choices increased the likelihood of returning to that option and that tendency was greater in the signaled sessions than the unsignaled sessions.

Food2-Risk2 correlations

To determine whether choices were driven by maximizing reward, the average number of pellets earned (Food2) was correlated against the average proportion of risky choices (Risk2) in a condition and signal factor for each subject. This analysis found a statistically significant negative correlation between total pellets obtained in a session and proportion of risky choices within a session for easy-task trials, r(12) = -0.86, p < 0.001. However, when evaluating the hard-task trials, there was not a statistically significant correlation found, r(12) = 0.21, p = 0.45. Thus, as in Experiment 1, preference for the risky option was not explained by the risky option delivering more pellets, on average, than the alternative.
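
For readers who want to reproduce this kind of analysis, a Pearson correlation and its degrees of freedom can be computed as in the sketch below. The function name is hypothetical, and no data from the experiment are included. The reported r(12) implies 14 paired observations, which is consistent with seven subjects contributing one point in each of the two signaling conditions; that reading is our inference from the degrees of freedom rather than something stated explicitly.

```python
import math


def pearson_r(x, y):
    """Return (r, df) for paired observations, with df = n - 2."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy), n - 2
```

A negative r for the easy-task trials means that subjects and conditions with more risky choices tended to earn fewer pellets.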

Accuracy2

In Experiment 2, the overall number of pellets earned from selecting the task option was determined by the degree of task accuracy. The full model was 164 times more likely than the model excluding the signal factor and over 1,000 times more likely than the model excluding the task-difficulty factor (Table 3). There was a main effect of task difficulty where the accuracy was much higher in the easy-session phases than in the hard-session phases (>80% vs. <35%; z = 7.93, p < 0.001). There was also a signal × task-difficulty interaction (z = 3.91, p < 0.001) where accuracy was higher in the easy-signaled (vs. unsignaled) phases (86% vs. 82%) and lower in the hard-signaled (vs. unsignaled) phases (30% vs. 33%). The task difficulty had a considerable effect on task accuracy in the expected direction and the signaling factor did not produce a consistent effect on accuracy. The hard session phase included a subset of impossible trials where the monkeys could only choose at random (one out of four), and here the monkeys performed at chance accuracy compared to the difficult-but-possible trials (25% vs. 35%; z = -6.54, p < 0.001).

Truncation2

Experiment 2 included a delay truncation response to determine whether the monkeys’ trial expectations were sensitive to the outcome signals (Fig. 7). The full model performed better than the reduced models, indicating that outcome signaling, task-difficulty, delay-segment, choice, and outcome factors were all significant in determining the probability that the monkeys would truncate the response (Table 3). The probability of truncating the delay period was smaller when pellets were not forthcoming (i.e., risky-loss or incorrect judgment; z = -3.35, p < 0.001), when outcomes were unsignaled (z = -2.13, p = 0.03), and particularly when a signaled-loss stimulus was presented (outcome × signal interaction; z = -40.89, p < 0.001). Generally, the reduction in delay-truncation responses for signaled-losses was greater after risky choices than task choices (choice × outcome × signal interaction; z = -10.91, p < 0.001). The reduced probability of truncating a delay in unsignaled trials was greater under unsignaled-hard trials than unsignaled-easy trials (condition × signal interaction; z = -3.13, p = 0.001). There were fewer delay-truncation responses in the first delay segment than in the second (z = 2.35, p = 0.01) and third (z = 2.34, p = 0.01) delay segments. The relatively fewer delay truncations in the first delay segment were more common following a task choice than a risky choice (delay-segment × choice interaction; z = -11.02, p < 0.001) and when the outcome was signaled during the delay period (delay-segment × choice × signal interaction; z = 8.29, p < 0.001).

Fig. 7

Proportion of delay periods where the monkeys used the cursor to truncate the first delay segment as a function of task-difficulty factor (easy vs. hard), signaling factor (signaled vs. unsignaled), choice factor (task vs. risky), and outcome factor (risky-win/ task-correct, risky-loss/task-incorrect) in Experiment 2. The other plot characteristics (plot fill, error bars, plot jittering) are identical to Fig. 2

Generally, for the second and third delay-truncation periods the monkeys would truncate the delays when pellets were signaled (win or correct judgment, >80%) or when the outcome was unsignaled (>70%), suggesting that the monkeys were engaged in the truncation response. However, the first delay-segment, when the monkeys first chose to initiate the delay-truncation response, showed more variance (Fig. 7). Post hoc tests indicated that there were statistically significant reductions in delay truncation between risky-win and risky-loss outcomes for signaled-easy (34% vs. 2%; z = -5.53, p < 0.001) and signaled-hard (32% vs. 2%; z = -5.07, p < 0.001) risky-choice session phases, and between correct-judgment and incorrect-judgment outcomes for the signaled-easy (25% vs. 3%; z = -3.44, p < 0.001) and signaled-hard (27% vs. 2%; z = -4.46, p < 0.001) task session phases. There were no differences in delay truncation between risky-win or risky-loss outcomes for the unsignaled-easy (72% vs. 64%; z = -0.67, p = 0.49), or unsignaled-hard (68% vs. 65%; z = -0.26, p = 0.79) session phases following a risky choice. Also, there were no statistically significant differences in delay truncation between a correct-judgment and incorrect-judgment (39% vs. 20%; z = -1.66, p = 0.09) following a task choice in the unsignaled-hard session phases. However, following a task choice for the unsignaled-easy session phases, there were fewer delay truncation responses following an unsignaled-loss than an unsignaled-win (12% vs. 43%; z = -2.95, p = 0.003).

These results suggest that the monkeys were able to anticipate the differential likelihood of earning pellets based upon performance on the circle-size judgment task, even though the outcomes were unsignaled. This outcome was only found in the easy-task phases, but the hard-task phases included a subset of “impossible” trials where the monkeys should not have been able to anticipate pellet reward likelihoods based upon performance. When evaluating the data based upon the trial types, there was no difference in the probability of truncating the delay based upon outcome for the impossible trial types (26% vs. 17%, z = -0.77, p = 0.44), but there was a greater likelihood of truncating the delay after a correct judgment than after an incorrect judgment when the trials were possible (35% vs. 16%, z = -1.96, p = 0.04).

Overall, as in Experiment 1, the monkeys selectively truncated signaled risky-win delays, demonstrating sensitivity to the outcome information. Furthermore, the monkeys’ risky choices were affected by the prior outcome. Risky choices were more likely following a risky-win and following an incorrect task-judgment, showing a win-stay-lose-shift pattern (Harlow, 1949). The influence of past trial outcomes was greater when the outcomes were signaled. That signaled prior outcomes had a greater influence on subsequent choices raises the possibility that signals enhance the salience of prior outcomes and/or make the events more memorable.

The tendency for the monkeys to truncate delays following a correct task response in the unsignaled session phase is novel. This would seem to support the argument that in the absence of an explicit outcome signal the monkeys had a metacognitive-like expectation of not earning a pellet following an incorrect response. The hypothesis that this reflects such an expectation is supported by the data where the monkeys did not differentially truncate wins and losses following unsignaled risky choices (where probability rather than task performance determined reward) and for the unsignaled impossible task trials (where performance was at chance levels and differential reward expectations would not be possible). If the monkeys’ metacognitive expectation of reward was influencing use of the delay-truncation response, then that raises the possibility that the delay truncation response is measuring the monkeys’ expectation of reward when the outcomes are signaled rather than simply measuring monkeys making some conditioned response to reward-correlated stimuli or measuring the monkeys truncating the delays simply to proceed through the trial.

However, caution regarding a metacognitive interpretation of the delay-truncation response is warranted for several reasons. First, conclusive experimental evidence of metacognition has been notoriously difficult (but not impossible) to find in capuchin monkeys (see Smith, Smith, & Beran, 2018). Second, the effect was only found in the first delay segment of a three-segment delay. Third, the effect could be considered weak, as the probability of truncating the delay was around 40% for unsignaled wins versus around 15% for unsignaled losses. Therefore, Experiment 3 was designed, in part, to replicate these effects from Experiment 2.

Experiment 3

Experiment 3 was primarily concerned with replicating the results from Experiment 2 with additional control conditions. Experiment 3 included the easy-task condition in Experiment 2 (hereafter referred to as the work condition) but replaced the hard-task condition with a yoked condition. For the yoked trials, the task offered a single box to contact (Fig. 4D) that provided pellets at a probability that was yoked to the probability of earning pellets in the easy task. Comparing risky choices in the work condition to the yoked condition can determine the contribution of the task contingencies independent of the rate of pellet reward. The yoked condition was essentially a safe(r) risky choice option with approximately a 0.8 probability of earning two pellets (based on the results of Experiment 2 the easy-task contingencies were used in Experiment 3 to allow for an approximate 0.8 probability of earning two pellets for the non-risky alternative).

It is also possible that there might be differences in risky choices between the work and yoked trials, although it is not immediately clear whether the two task contingencies may affect risky choices or interact with the signaling conditions. It is possible that the monkeys would make more risky choices in the yoked condition compared to the work condition because the easy-task option offered an “illusion of control” to earn two pellets that the yoked option does not. That is, the work condition permitted the monkeys to earn pellets based upon their own behavior (i.e., ostensibly the monkeys had a 100% possibility of earning reward at the moment they entered the easy-task period, if they could perform accurately). However, reward in the yoked condition was entirely determined by a probability gate. This prediction is supported by studies that show nonhuman primates preferring to choose their own tasks to complete rather than have the tasks selected for them (see Perdue, Evans, Washburn, Rumbaugh, & Beran, 2014; Washburn, Hopkins, & Rumbaugh, 1991). But, in the present case, the monkeys might prefer the task option (relative to risky option) under conditions where the monkeys’ performances determine reward.

Experiment 3 further determined whether the monkeys were demonstrating some degree of metacognitive awareness about whether their task response was correct or not. If the delay-truncation response reflects an expectation of reward, then one would expect that the monkeys in the unsignaled trials would continue to avoid truncating the delays more often when an incorrect task response was made in the work condition, but not when a pseudo-incorrect outcome was probabilistically assigned in the yoked condition or when an unsignaled risky-loss outcome was assigned.

Finally, Experiment 3 progressed through two stages to determine whether allowing the monkeys to truncate the delays affected their risky decision-making. The first Go-Truncation Stage allowed the monkeys to truncate the delay period by using the cursor (as in Experiments 1 and 2) and the following No-Truncation Stage did not allow the monkeys to shorten the delay period (i.e., the three cards were presented, but the cursor was not available to make a truncation response and each of the delay segments elapsed after 3.34 s). The No-Truncation Stage was included to assess the influence of the delay-shortening contingencies on risky choice. This rules out the possibility that the capuchin monkeys favored the risky option in the signaled condition because their tendency to reduce the delay led them to erroneously learn that the delay-truncation response was causing the risky-win jackpot. If the signal-induced preference for the risky option was not due to the truncation response (as with rhesus monkeys in Smith et al., 2017) then there should be no difference in risk preference between the Go-Truncation Stage and the No-Truncation Stage.

Method

Subjects

Seven capuchin monkeys (four males: Benny, Griffin, Mason, and Nkima; and three females: Bias, Gambit, and Star) participated in Experiment 3. Logan was removed because changes in his housing situation prevented his participation. Star was included because her testing cooperation was sufficient to produce a complete data set.

Apparatus

The same apparatus described in Experiment 1 was used for Experiment 3.

Procedure

Experiment 3 comprised up to 12,000 trials per subject (6,000 trials per stage), and 90% of all sessions were fully completed. The procedure used for Experiment 3 was identical to the procedure described in Experiment 2, with the following two exceptions:

  1. The work trials were identical to the easy-task trials in Experiment 2. The hard-task trials were replaced with yoked-task trials. The yoked trials presented a single square box centrally located where the four circles were generally located (Fig. 4D). Once the cursor contacted the square there was a probabilistic chance that two pellets were delivered at the end of the delay. The assigned probability was yoked to the obtained probability of earned pellets in the immediately prior work phase (within the same session). The work phase necessarily preceded the yoked phase in order to determine the reward probability for the yoked phase. The order of presentation for the signaled or unsignaled phases was randomized.

  2. Experiment 3 progressed through two stages. The first Go-Truncation Stage permitted delay truncation during the delay period as in Experiment 2 and the following No-Truncation Stage was the same as the Go-Truncation Stage, but the delay period could not be shortened. The cursor was not presented, and each card elapsed after 3.34 s for a total of a 10-s delay. The No-Truncation Stage followed the Go-Truncation Stage for all subjects.
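The yoking contingency described above can be illustrated with a minimal sketch. It is not the experimental software: the function names, the example work-phase outcomes, and the fixed random seed are all assumptions made for illustration. It shows only the logic stated in the text, namely that the yoked reward probability equals the proportion of rewarded trials in the preceding work phase, and that three 3.34-s delay cards sum to roughly 10 s.

```python
import random


def yoked_reward_probability(work_outcomes):
    """Proportion of rewarded work trials, reused as the yoked-phase probability."""
    if not work_outcomes:
        raise ValueError("the work phase must contain at least one trial")
    return sum(work_outcomes) / len(work_outcomes)


def yoked_trial_pellets(p_reward, rng, pellets=2):
    """Pass one yoked trial through the probability gate (2 pellets or none)."""
    return pellets if rng.random() < p_reward else 0


# Hypothetical work phase: 8 of 10 task responses were correct.
work_phase = [True, True, False, True, True, True, False, True, True, True]
p = yoked_reward_probability(work_phase)

rng = random.Random(0)  # fixed seed only so that the sketch is repeatable
yoked_phase = [yoked_trial_pellets(p, rng) for _ in range(10)]

# No-Truncation Stage: three delay cards of 3.34 s each, roughly a 10-s delay.
total_delay_s = 3 * 3.34

print(p, yoked_phase, round(total_delay_s, 2))
```

Because the work phase must finish before its obtained reward rate is known, a sketch of this kind also makes clear why the work phase necessarily preceded the yoked phase within a session.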

Data analysis and hypotheses

Experiment 3 included 20 sessions (10 Go-Truncation Stage sessions and 10 No-Truncation Stage sessions) and 90% of the sessions were fully completed. Out of 12,000 (6,000 Go-Truncation Stage; 6,000 No-Truncation Stage) possible trials the monkeys completed the following number of trials: Benny (5,251; 6,000), Bias (5,765; 5,850), Gambit (6,000; 6,000), Griffin (5,250; 5,859), Mason (4,200; 3,510), Nkima (6,000; 4,500), and Star (4,952; 4,945). Most monkeys completed the majority of the sessions. Mason’s reduction in session completion correlated with disruptions within his social group. Monkey Star’s reduction in session completion could be due to similar social group disruptions or her advanced age for a capuchin monkey.

We conducted similar linear mixed-effects modeling analyses to those reported in Experiment 2. The proportion of risky choice was modeled (Risk3) as a function of the signaling factor (signaled or unsignaled), task-type factor (work or yoked), truncation-stage factor (Go Truncation or No Truncation), and their interactions. To determine whether preferences for the risky option came at the cost of a reduced number of pellets earned, the average number of pellets earned in a session (Food3) was correlated with the average probability of making a risky choice for each subject as a function of the signaling, task-type, and truncation-stage factors. The probability of making a correct task response (or pseudo-correct task outcome for the yoked condition) was modeled (Accuracy3) as a function of the task-type, truncation-stage, and signaling factors. This was done to confirm that the probability of reward was the same between the work and the yoked conditions and that accuracy was not affected by the opportunity to truncate the delay.

Delay truncation responses were modeled for the Go-Truncation Stage in Experiment 3 (Truncation3). The model predicted the probability of delay truncation as a function of the signaling factor, task-type factor, choice factor (risky or task), trial-outcome factor (win/correct or loss/incorrect), and delay-segment factor (first, second, or third delay-segment). If the delay truncation response was a product of some metacognitive-like process, then during the unsignaled conditions the probability of truncating the delay following a correct circle judgment should be higher than following an incorrect circle judgment. However, the metacognitive-like outcome should only be present in the work trials.
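The model comparisons reported below are summarized as likelihood ratios ("times more likely") and chi-square tests. The text does not state how the likelihood ratio was computed, so the following sketch rests on assumptions: it treats the ratio as an information-criterion evidence ratio, exp(ΔAIC/2), and pairs it with a standard likelihood ratio test for nested models. All log-likelihood and AIC values are hypothetical, and the chi-square tail function covers only the 1-degree-of-freedom case.

```python
import math


def lr_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio test statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


def evidence_ratio(aic_best, aic_other):
    """How many times more likely the best model is than a comparison model
    (assumed here to be the AIC-based evidence ratio)."""
    return math.exp((aic_other - aic_best) / 2.0)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical fits: a full model and a reduced model lacking one factor.
loglik_full, loglik_reduced = -1200.0, -1210.0
aic_full, aic_reduced = 2410.0, 2424.0

stat = lr_statistic(loglik_full, loglik_reduced)
p_value = chi2_sf_df1(stat)
ratio = evidence_ratio(aic_full, aic_reduced)

print(f"chi-square(1) = {stat:.1f}, p = {p_value:.2g}, evidence ratio = {ratio:.0f}")
```

Under these assumptions, an AIC difference of about 14 or more is enough to yield an evidence ratio above 1,000, which is the order of magnitude reported for several comparisons in Table 4.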

Results and discussion

Risk3

The full risky choice model was over 1,000 times more likely than the reduced models that excluded the signal condition, task type, and the phase condition (Table 4). Figure 8 shows the proportion of risky choices across all session phases and stages. The monkeys were more likely to choose the risky option under the signaled phases across all conditions (z = 4.92, p < 0.001). Specifically, between the signaled and unsignaled trials, the monkeys were more likely to favor the risky option in the Go-Truncation Stage for the work (59% vs. 37%; z = -3.57, p < 0.001) and yoked (62% vs. 33%; z = -3.57, p < 0.001) task trials. Also, between signaled and unsignaled trials, the monkeys were more likely to favor the risky option in the No-Truncation Stage for the work (55% vs. 28%; z = -3.57, p < 0.001) and yoked (53% vs. 31%; z = -3.57, p < 0.001) task trials. There was a condition × signal × phase interaction (z = 5.47, p < 0.001) where the difference in the proportion of risky choices between the signaled and unsignaled phases was larger for the yoked task phases than for the work task phases in the Go-Truncation Stage (62% − 33% = 29% difference as opposed to 59% − 37% = 22% difference), but smaller for the yoked phases in the No-Truncation Stage (53% − 31% = 22% difference as opposed to 56% − 29% = 27% difference). The effect of outcome signaling on encouraging risky choice was observed in most monkeys across all conditions, but Gambit and Nkima did deviate from the group in some cases. Nkima showed a greater probability of choosing the risky option in the No-Truncation Stage compared to the Go-Truncation Stage. Gambit made fewer risky choices in the No-Truncation Stage than the Go-Truncation Stage, and she showed an apparent reverse signal effect in the No-Truncation Stage. Those two monkeys likely accounted for the three-way interaction.
Finally, the average proportion of risky choice in Experiment 3 (Risk3) was correlated with the proportion of risky choice in Experiment 2 (Risk2) for monkeys that were common to both experiments (Benny, Bias, Gambit, Griffin, Mason, and Nkima). There was a statistically significant correlation between Risk2 and Risk3 predictions of risky choice between subjects, r(72) = 0.47, p < 0.001. Comparing Figs. 6 and 8 shows some of the similarities in risky choice patterns.

Table 4. The likelihood ratios and results of a likelihood ratio test for all models in Experiment 3. The Best Model is the model that parsimoniously accounted for the most variance compared against all comparison models that differ by removing a factor. The Likelihood reports how much more likely the best model is than the comparison model. The chi-square statistic results report whether the best model performed statistically better than the comparison model.
Fig. 8. Proportion of risky choices as a function of the task factor (work vs. yoked), signaling factor (signaled vs. unsignaled sessions), and the truncation-stage factor (Go-Truncate vs. No-Truncate delay periods). The other plot characteristics (plot fill, error bars, plot jittering) are identical to Fig. 2

History3

The model that excluded the task type factor (work vs. yoked) was the best fitting model and was over 54 times more likely than the full model (Table 4). The best fitting model was over 1,000 times more likely than the model that excluded the prior-outcome, the choice, the signaling, and the truncation-stage factors. Risky choices were more likely following a risky-win (vs. risky-loss) and following an incorrect task-judgment (vs. correct task-judgment) (choice × prior choice interaction; z = -9.24, p < 0.001), and this relationship was stronger in the signaled sessions than in the unsignaled sessions (choice × prior choice × signal interaction; z = -4.63, p < 0.001). The History3 model replicated the effects reported in the History2 model.

Food3-Risk3 correlation

Consistent with what was found in Experiment 1 and the easy-task trials in Experiment 2, there was a negative correlation between total pellets obtained in a session and proportion of risky choice within a session, r(54) = -0.87, p < 0.001. The monkeys favored the risky option at the expense of maximizing pellets.
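The degrees of freedom reported for this correlation are consistent with one observation per monkey in each design cell: 7 monkeys × 2 signaling × 2 task-type × 2 truncation-stage conditions gives 56 observations, and a Pearson correlation has n − 2 degrees of freedom. That reading is an inference from the analysis description, not something the text states outright. The sketch below shows the computation with a hand-written Pearson function; the five data pairs are hypothetical and serve only to illustrate a negative relationship between risky choice and pellets earned.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with n >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Assumed layout: one observation per monkey per design cell.
n_observations = 7 * 2 * 2 * 2
degrees_of_freedom = n_observations - 2

# Hypothetical cell means: proportion of risky choices and pellets per trial.
risky_proportion = [0.2, 0.3, 0.5, 0.6, 0.8]
pellets_per_trial = [1.9, 1.8, 1.7, 1.5, 1.4]
r = pearson_r(risky_proportion, pellets_per_trial)

print(degrees_of_freedom, round(r, 2))
```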

Accuracy3

The null model that did not include the signal, task-type, or truncation-stage factors performed better than the full model (Table 4). Like the easy trials in Experiment 2, task accuracy was 80 ± 5% regardless of trial signals, task-type, and truncation-stage. This confirms that the yoked task trials provided pellets at the same probability as the work task trials (as they were programmed to do) and that the opportunity to truncate the delay did not affect task performances.

Truncation3

To continue to assess the monkeys’ trial outcome expectation, the truncation response was included in the Go-Truncation Stage in Experiment 3. The best model was the full model including the signal, task-type, delay-segment, choice, and outcome factors (Table 4). Truncating the delay was less likely when no pellets were forthcoming (z = -3.37, p < 0.001), when the outcomes were signaled (z = -3.4, p < 0.001), and especially when unrewarded outcomes were signaled (outcome × signal interaction; z = -26.15, p < 0.001). There were fewer delay truncation responses in the first delay segment than in the second (z = 2.90, p = 0.003) and third (z = 4.34, p < 0.001) delay segments. The difference between the first delay segment and the third delay segment was greater following a task choice than a risky choice (choice × delay-segment interaction; z = -12.43, p < 0.001), especially under unsignaled sessions (choice × delay-segment × signal interaction; z = -26.15, p < 0.001).

As in Experiment 2, there was a lower probability of a truncation response in the first delay-segment where the monkeys would have the first opportunity to choose to truncate the delay or not (Fig. 9). When the trial outcomes were signaled there were fewer truncation responses preceding a signaled-loss outcome following a risky choice (work trials: 5% vs. 21%; z = -3.57, p < 0.001; yoked trials: 5% vs. 21%; z = -3.57, p < 0.001) and a task choice (work trials: 2% vs. 30%; z = -5.70, p < 0.001; yoked trials: 8% vs. 24%; z = -3.08, p < 0.001). When trial outcomes were unsignaled there were no differences between win and loss outcomes following a risky choice (work trials: 34% vs. 44%; z = -0.89, p = 0.37; yoked trials: 43% vs. 55%; z = -1.103, p = 0.27) and a yoked task choice (25% vs. 26%; z = -0.14, p = 0.88). However, consistent with the findings from Experiment 2, with unsignaled task choices the monkeys were less likely to truncate the delay following an incorrect (loss) choice than a correct (win) choice (11% vs. 26%; z = -2.37, p = 0.01).
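For readers who want to relate the reported z statistics to their p values, the following sketch converts a z value to a two-sided p value under a standard normal reference distribution. Treating the reported tests as two-sided normal tests is an assumption; the function name is illustrative, and only the z values themselves are taken from the text.

```python
import math


def two_sided_p(z):
    """Two-sided p value for a z statistic under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# z values reported above for the unsignaled comparisons.
for z in (-0.89, -1.103, -0.14, -2.37):
    print(f"z = {z:+.3f} -> p = {two_sided_p(z):.3f}")
```

Under this assumption the three nonsignificant unsignaled comparisons stay well above the conventional 0.05 threshold, whereas the work-trial comparison between incorrect and correct responses (z = -2.37) falls below it.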

Fig. 9. Proportion of delay periods where the monkeys used the cursor to truncate the first delay segment as a function of the task factor (work vs. yoked), signaling factor (signaled vs. unsignaled), and outcome factor (risky-win/task-correct, risky-loss/task-incorrect) in Experiment 3. The other plot characteristics (plot fill, error bars, plot jittering) are identical to Fig. 2

Thus, as with Experiment 2, the monkeys were (on average) less likely to truncate an unsignaled delay following an incorrect work response than following a correct work response, and this effect did not appear on the unsignaled yoked trials where the monkeys could never have confidence in the probabilistically determined outcome.

General discussion

This research sought to further establish that outcome signals increase risky choice in monkeys and that monkeys would truncate the delays whenever they anticipated that the outcome would result in food pellets. Experiment 1 demonstrated that capuchin and rhesus monkeys favored the risky option when the outcomes were signaled and would shorten a delay period if a risky-win outcome was signaled but wait through the delay if a risky-loss was signaled. This demonstrates that the monkeys were sensitive to the delay-period stimuli correlated with the upcoming outcome, but it does not necessarily demonstrate that the monkeys were using the stimuli as information about the upcoming outcome.

Experiment 2 further demonstrated that capuchin monkeys’ risky choices were increased with outcome signals under conditions where the “safe” option led to a perceptual judgment task of varying difficulty. As in Experiment 1, the capuchins were less likely to truncate the delay when a loss outcome was signaled. The capuchins generally truncated the delays when the outcomes were unsignaled, except when they made an incorrect selection in the circle judgment task. This use of the truncation response suggested that the monkeys might avoid truncating delays when they did not expect a reward. If true, this could reflect a metacognitive-like mechanism where the monkeys consciously decided to truncate the delays rather than truncating the delays out of some conditioned response. However, a follow-up experiment was required to confirm the reliability of these effects. Experiment 3 confirmed that the monkeys were less likely to truncate an unsignaled delay when the monkeys made an incorrect judgment but not when the task delivered reward probabilistically. Experiment 3 also confirmed that the availability of the delay-truncation response did not meaningfully affect preference for the risky option. Finally, Experiment 3 determined that risky decision-making was not influenced by whether the task option involved a perceptual judgment task or a yoked probabilistic reward task.

Experiments 2 and 3 modified the traditional risky choice procedure by replacing the alternative to the risky option, often a safe(r) option, with a task that required the monkeys to “work” to earn pellets. In some ways this modification makes this procedure a better model of human gambling behavior, because for humans in society the alternative to earning pay through gambling is to earn it through labor. Experiment 2 roughly explored economic conditions on risky choices by including an easy or hard task to work for pellets. The monkeys were less likely to choose the risky option when the alternative to a gamble was a difficult task, although this was only the case when the outcomes were unsignaled. However, it is uncertain whether this outcome was due to the task difficulty or confounded with the reduction in number of pellets earned. Although making the monkeys work for reward as an alternative to gambling may improve the ecological validity of this gambling model, future research should assess the effects of token rewards on choices between a risky-gamble option and a safer-work option. Monkeys do work for token rewards that can be exchanged for food rewards, and they are sensitive to risky contingencies with token rewards (see Zakrzewski et al., 2014), making this a feasible and productive line of future research.

The models investigating the effect of the previous trial demonstrated that in Experiment 1 the monkeys tended to return to the option that they had previously chosen, regardless of the outcome. This replicated the results from Smith et al. (2018) but with capuchin monkeys as well as the same rhesus monkeys that were involved in that study. This was a curious failure of the monkeys to shift their choices following a risky-loss outcome (signaled or unsignaled). Experiments 2 and 3, however, did show some interesting effects of the prior outcome. In both experiments the capuchin monkeys were responsive to the trial outcomes. That is, compared to prior risky-win outcomes, monkeys were less likely to choose the risky option following a risky-loss; and compared to a prior correct-judgment outcome, monkeys were less likely to choose the task option following an incorrect-judgment outcome. They were even more responsive to trial outcomes when those trials were signaled. Thus, in Experiments 2 and 3, the monkeys followed more of a win-stay-lose-shift strategy. Why Experiment 1 failed to show a similar effect (for risky choices at least) is uncertain. It is possible that, because the safe option always delivered two pellets, the certainty of that outcome reduced overall sensitivity to the outcome contingencies. The fact that the signaled trials produced a greater effect is interesting, and it suggests that the outcome signaling enhanced the impact of those outcomes. The mechanism behind that enhancement is unclear (e.g., motivational, enhancing short-term memory for the event, etc.). Regardless, even though the signaling made the monkeys more sensitive to risky losses, which reduced risky choices on the following trial, the monkeys remained biased towards choosing the risky option when the outcomes were signaled.

In Experiment 3 it was hypothesized that there may have been more risky choices in the yoked condition compared to the work condition because the work condition might have offered an illusion of control. Prior research has shown that rhesus monkeys and capuchin monkeys preferred to choose which computer task they engaged in over being forced to engage a task, while equating the rates of reward (Perdue et al., 2014). In the work condition, the possibility of earning two pellets for selecting the task option was putatively under the monkeys’ control – if they performed accurately, then they earned the pellets. However, in the yoked condition, the possibility of earning the two pellets was always up to a probability gate and outside of the monkeys’ control for that specific trial. However, the monkeys did not show a difference in preference for the risky option in the work condition compared to the yoked condition (regardless of the signaling condition). That is not to say that such an outcome could not be found, but it would probably require a more straightforward procedure to assess it. The task contingencies and the yoked contingencies should be compared directly as the only two options available. Such a procedural modification would be closer to the procedure used by Perdue and colleagues.

The delay truncation response provided a method to assess the monkeys’ sensitivity to the information conveyed in the delay stimuli. All three experiments demonstrated that the monkeys, from both species, would generally truncate the delays prior to a signaled-win and wait out the delays prior to a signaled-loss. Overall, the monkeys were sensitive to the information conveyed by those explicit outcome signals. A novel finding from these experiments is the apparent sensitivity to the task outcomes in the unsignaled session phases. The tendency for the capuchin monkeys to wait through the delay period after the monkeys made an incorrect circle-size judgment selection, while generally truncating the delays after a correct response or when rewards were probabilistically delivered, demonstrated that the monkeys had an outcome expectation independent of any programmed signal and those expectations may have come from some endogenous/private/metacognitive signal that provides a sense of what is to come (Hampton, 2009; Smith et al., 2009). In Experiment 2 this effect was found overall for the easy-task trials. For the hard-task trials, this effect was only found for the subset of trials where it was possible for the monkeys to make a correct judgment. For the subset of “impossible” task trials, where the target circle was indistinguishable from the foils, the best the monkeys could do was guess (and, indeed, their accuracy was at chance levels). In those impossible trials the monkeys were equivalently likely to truncate the delays prior to both a correct and an incorrect response. All of these effects were only observed in the first delay segment, whereas in the following delay segments the monkeys generally truncated delays regardless of task accuracy. 
Overall, the differential likelihood of truncating the delays as a function of task accuracy in the unsignaled trials is consistent with a hypothesis that the monkeys’ truncation responses were governed by a metacognitive process. However, there were reasons to be cautious about that interpretation. Namely, the effect was restricted to the first segment, and the research supporting capuchin monkey metacognition is inconsistent – they do not always demonstrate metacognitive patterns, and when they do the effect is not as strong as what has been observed in rhesus monkeys (e.g., Beran et al., 2014; Smith, Beran, Couchman, Coutinho, & Boomer, 2009; Smith, Smith, & Beran, 2018).

Because of the curious, but cautiously interpreted, outcome in Experiment 2, Experiment 3 sought to replicate the metacognitive-like effect in capuchins with a yoked-task condition where the monkeys could not have a metacognitive hunch about whether a given trial would produce reward or not. Experiment 3 demonstrated that some monkeys were not likely to truncate the unsignaled delay following an incorrect response in the work condition, whereas the monkeys were likely to truncate the unsignaled delays in the comparable yoked condition that assigned “pseudo-incorrect” outcomes. However, the present study did not include all of the appropriate control conditions to permit a definitive conclusion that a metacognitive process can solely explain these data. Future research should assess whether the use of the delay truncation response, which is akin to the reported go-when-you-know metacognitive assessment of outcome confidence in chimpanzees (Beran, et al., 2015) and capuchin monkeys (Smith et al., under review), will spontaneously generalize across a variety of tasks that promote uncertainty (see Brown, Templer, & Hampton, 2017; Kornell, Son, & Terrace, 2007).

Collectively, these data are consistent with the hypothesis that the monkeys’ use of the delay truncation response may reflect an outcome expectancy process that can be informed by explicit external signals or metacognitive-like endogenous signals. This supports the idea that the truncation response is measuring the monkeys’ outcome expectancy, rather than a conditioned response to stimuli that predict reward. If the signals influence risky decision-making by affecting the monkeys’ outcome expectation, then that further strengthens the link between the signaling effect and gambling behavior, since gambling behavior can be considered a disorder of reward expectation (Linnet, 2014; van Holst et al., 2012). These results also support the idea of behavioral work with nonhuman primates as a valuable model for understanding human gambling behavior in a comparative perspective.

Open Practices Statement

The data and materials for the experiments are made available. The R scripts are included in the Supplementary Materials and the data are located online at the Open Science Framework (osf.io/76fzc). See Supplementary Materials for additional details not included in the article. None of the experiments were preregistered.