Introduction

Our ability to detect changing situations and our capacity to adapt our decisions accordingly may determine the success of the choices we make. Choice options that were initially rewarding might become less rewarding in the future, or vice versa; and safer alternatives might become riskier over time, while riskier options might become safer. For example, stock market traders must successfully detect changes in the market to stay on top of the best prices. Traders must track and adapt their investment decisions according to the available assets, their expected probability of success, and the stock values. All of these aspects may result in the increase or decrease of stock values over time, making some investments potentially more profitable than others.

However, detecting changing situations is difficult because we often do not observe the information that would signal a change in the environment and instead rely on our own experience, i.e., the outcomes of similar decisions we have made in the past (Behrens, Woolrich, Walton, & Rushworth, 2007; Gonzalez et al., 2003). To adapt choices to changing environments, individuals need to track the statistical structure of the environment and adapt their learning accordingly (Behrens et al., 2007). For example, the stock market investor must track the changing signals in the market in order to determine the most profitable investment portfolios. When relying on experience, at least two aspects will influence the way people learn and adapt their choices to changes in the environment: the feedback from the decisions made, which signals whether a decision resulted or could have resulted in a good outcome; and the direction (i.e., trend) in which a change occurs, which determines whether the environment is improving or worsening. For example, the past trend of a market, the outcome of past investment decisions, and whether a stock’s value is expected to increase or decrease over time will all influence the trader’s choices.

Past work in behavioral decision research (discussed below) suggests that humans adapt their choices slowly to volatile environments. Researchers have used controlled choice environments, usually repeated binary choices, with full feedback (where the outcomes of both the chosen and unchosen options are revealed) or partial feedback (where the outcome of only the chosen option is revealed); and they have also included diverse patterns of change in the value of the choice options over time. Current results point to “stickiness” (i.e., over-reliance on initial experiences) as an inhibitor of adaptation, which can lead to low awareness of change depending on the kind of feedback and the trend of the change. For example, a trader may adapt more successfully to market changes when an investment was initially better but worsens over time than to changes in an initially worse but improving investment, because the trader was not investing in the initially worse option in the first place and thus has little opportunity to notice its improvement. This trend effect may be influenced by the feedback and the outcomes that are actually observed as decisions are made in a volatile environment.

In this paper, we advance current research on choice adaptation by clarifying the trend effects according to different types of feedback and of information that signals a change in the environment. Specifically, we examine performance and individual strategies before and after the options change while participants learn in a binary choice task. We determine how adaptation strategies under different trends may be influenced by the type of feedback (full or partial) and by direct observation of the information that changes (probabilities or outcomes). Our results from two experiments replicate and extend findings on the effects of trend, feedback, and observability of outcomes in dynamic environments, improving our understanding of the conditions that influence adaptation of choice.

Trend effects and adaptation to change

The trend of change in the environment (whether a decision option was initially better but worsens over time or whether it is initially worse but improves over time) has been studied in binary choice tasks where one of the options is stationary and the other is non-stationary (Rakow & Miler, 2009). In their experimental paradigm, participants played multiple games of repeated binary choice in which the probability of receiving a high outcome in the non-stationary option changed over trials according to various patterns. For some patterns, the probability changed from stable to increasing to stable again; for other patterns, it changed from stable to decreasing to stable again. Thus, for increasing probability patterns, the non-stationary option was initially lower in expected value than the stationary option but then became higher, whereas for decreasing probability patterns it was initially higher in expected value but became lower. Rakow and Miler’s results suggest that participants adapt slowly to changes in the environment, mostly due to stickiness, in which initial experiences impact later choices and may inhibit adaptation to changes in the probabilities: participants overall preferred the option that was initially higher in expected value, inhibiting their adaptation in later choices. Their findings also suggested an effect of trend on adaptation. Specifically, their first experiment revealed fewer switches away from a decreasing probability option than toward an increasing one. However, Rakow and Miler’s experimental design did not manipulate the trend alone, and their observation of stickiness did not replicate across their own experiments. Thus, the authors carefully proposed (but did not conclude) that decision makers adapted better to an improving trend of change (an initially unfavorable option that improves over time) than to a worsening one (an initially favorable option that worsens over time), and they left the study of the trend and its cognitive explanation open for future research.

Recent studies have corroborated and extended the initial observations of Rakow and Miler, and their discussion of recency in non-stationary environments has also received subsequent attention. Recency is an over-reliance on one’s most recent experiences, and studies have pointed to recency as a key factor facilitating the adaptation of choice to changing conditions in the environment. For example, Lejarraga et al. (2014) suggested that recency may drive the better adaptation of choices made by individuals (compared to groups). They found that individuals adapted to change more successfully than groups in gambles with a decreasing probability of the best outcome, but not in one with an increasing probability. Their argument was that individuals, having “poorer” memory than groups, rely more on recent than on earlier experiences, which leads them to adapt better to change than groups do. However, this effect was not robust across all increasing trends, leading Lejarraga et al. (2014) to leave the test of the trend effect for future research.

In an attempt to clarify the effect of trend on adaptation to change, and the roles of stickiness and recency, researchers have directly manipulated the trend in experiments using a binary choice task similar to that of Rakow and Miler, where the probabilities of obtaining a high outcome in a non-stationary option change continuously over time (e.g., increase or decrease linearly; see Cheyette et al., 2016; Konstantinidis, Harman, & Gonzalez, 2022). Current results indicate slower choice adaptation to increasing probabilities than to decreasing probabilities, and an emerging preference for a sure option when the alternatives are stationary. These results are also supported by parameters derived from cognitive models of experiential choice (Cheyette et al., 2016; Gonzalez et al., 2003; Konstantinidis et al., 2022). Konstantinidis and colleagues fit models to individual participant data and observed that individual differences in recency-of-memory parameters were associated with successful adaptation. These findings suggest that individuals who were more forgetful of past experiences were better at adapting to changing environments, in contrast to the face-valid argument that more information about the history of events helps decision makers adapt better. We extend these recent investigations by determining how the trend effect may be influenced by the type of outcome feedback and by the observability of the element of change.

Feedback effects and observability of outcomes

Feedback is an important aspect of learning and choice behavior in sequential decision tasks (Erev & Barron, 2005; Gonzalez et al., 2003). Specifically, environments with partial feedback force decision makers to choose between exploring the outcomes of choice options to obtain more useful information and exploiting choice options based on already known information (Camilleri & Newell, 2011; Yechiam & Rakow, 2012). In contrast, full feedback removes the need for choice-based exploration, allowing decision makers to focus on exploiting the available choice options, and also to better adapt to harsh environments (Rakow et al., 2015). These behavioral patterns are also supported by neuroscientific evidence: there are strong correlations between the knowledge of forgone outcomes (i.e., full feedback) and brain activity in areas involved with valuation and choice (Lohrenz et al., 2007). Thus, we would expect that choice adaptation would depend on the type of feedback provided.

Rakow and Miler (2009) used full feedback, and as discussed above, they found evidence for a potential effect of trend on adaptation, explained partially by the stickiness effect. In contrast, recent investigations used only partial feedback paradigms (Cheyette et al., 2016) and found an effect of the trend on adaptation explained by the recency effect.

Other related studies have directly compared the effects of partial and full feedback in volatile environments, but not the effects of the changing trend. Avrahami et al. (2016) found differences in choice behavior when full versus partial feedback was provided in a task where the expected values of two choice options flipped multiple times. They found that, when full feedback was provided, participants showed a greater preference for the riskier of the two choice options, and their choices were influenced by the outcome of the riskier choice. In comparison, when partial feedback was provided, participants’ choices were only influenced by outcomes when their current choice obtained a low outcome and the last outcome of the forgone option was a high outcome.

More generally, the effects of the type of feedback may be directly linked to the observability of the change itself. For example, recent research has found that participants who are more sensitive to probabilities and outcome values can better respond to changes in these features (Wulff, Mergenthaler-Canseco, & Hertwig, 2018). Also, Ashby and Rakow (2016) found that, with long sequences of choices, participants often fail to observe some of the outcomes. In particular, participants fail to observe forgone outcomes more often than obtained outcomes, and do so increasingly as the task progresses. These results suggest that participants’ ability or propensity to observe the change will influence the way they identify changes in the task, particularly when changes occur later in the decision task.

In binary choice tasks, changes in underlying probabilities can only be inferred from the frequency with which the various outcomes occur. This sets an upper limit on how accurately participants can adapt to changes in probabilities, even if they pay full attention to the information provided. Research has found that the volatility of a reward environment influences the way individuals adjust their choices (Behrens et al., 2007), and sensitivity to probability values varies: most participants making decisions from experience are least sensitive to differences in moderate probabilities (between roughly 0.4 and 0.6) rather than in high probabilities (Kellen, Pachur, & Hertwig, 2016).

In contrast to past studies on changes in probabilities, we offer a novel way to investigate the effect of the observability of change, by making the probabilities fixed and the outcomes variable. Given that the outcome values are observed directly, we expect that changes in outcomes will be quicker and easier to detect compared to changes in probability. Thus, the type of change faced (whether in probabilities or outcome values) is expected to influence adaptation to different trends of change.

Summary

To summarize, we extend current research by determining how the trend effect of change adaptation may be influenced by the type of feedback and the observability of the element of change (i.e., probability or outcome).

In Experiment 1, we experimentally manipulate the trend (direction) of the change in probability using a continuous linear function, and the type of feedback provided in choices between two options: one with stationary probabilities and one with non-stationary probabilities. In Experiment 2, we manipulate the trend of changing outcome values instead of probabilities to test the effect of directly observing change. Experiment 2 presents gamble pairs with constant probabilities and changing outcomes while keeping the overall expected values and manipulated variables (trend and feedback) similar to Experiment 1.

Given the literature reviewed above, we expect a main effect of the trend, where a change in probabilities (or outcomes) would be easier to detect when the changing decision option is decreasing (initially better but worsening over time) rather than increasing (initially worse but improving over time). We also expect a main effect of feedback, where full feedback would result in better adaptation to change compared to partial feedback. Finally, we expect that changing outcome values will facilitate adaptation and reduce the effect of trends compared to changing probabilities over time.

Experiment 1: Detection of probability change

Method

Participants

We tested 600 participants (Mage = 35.5, SDage = 11.2 years; 323 female), recruited from Amazon Mechanical Turk. Participants were compensated solely based on their performance in the choice task (1 cent for every 100 points accumulated in the task), earning $2.60 on average (minimum = $1.65 and maximum = $4.85). No participants were excluded from analysis. The sample size was chosen based on the available budget.

Task and design

We used a repeated binary choice task commonly used in research on decisions from experience (Barron & Erev, 2003; Rakow & Miler, 2009). In this paradigm, participants make multiple choices between two buttons representing the gambles, a probability determines the outcome received after each option is selected, and immediate feedback is provided about the outcomes.

Participants were randomly assigned to one condition of a 3 × 2 between-participants design: a trend of probability (increasing, decreasing, or constant), and a type of feedback (partial or full). Participants were distributed about equally across the six conditions (N = 100 ± 1 for each). In each condition, participants completed 100 trials of the binary choice task. Both options were risky prospects with the same two possible outcomes (500 or 0 points), but one option was stationary (probabilities did not change over the course of 100 trials) and the other option was non-stationary (probabilities increased, decreased, or did not change, according to the experimental condition). In the stationary option, participants had a 0.5 probability of receiving the high-value outcome (500 points) and 0.5 probability of receiving the low-value outcome (0 points). In the non-stationary option, the probability of receiving the high-value outcome (500 points) increased, decreased, or stayed constant over the course of 100 trials (the probability of the low-value outcome, 0 points, was complementary).

Figure 1 illustrates the concept of partial and full feedback in an example of the increasing condition. Option A (stationary option) gives +500 points with 0.5 probability (0 otherwise), while Option B (non-stationary option) gives +500 with low probability initially but that probability increases over time. After each choice, participants may see the outcome of only their selected choice (partial feedback, Fig. 1, left panel) or may see the outcomes from both their selected choice and the forgone choice (full feedback, Fig. 1, right panel). Feedback refers to the realization of the outcomes and not to information about the gamble probabilities.

Fig. 1

The repeated consequential choice paradigm of decisions from experience used in our study. The example illustrates a case of an increasing probability of obtaining the high outcome (+500 points) in the non-stationary option. The left panel illustrates partial feedback and the right panel illustrates full feedback

Figure 2 illustrates the trend in the probabilities of the high-value outcome in the three trend conditions (continuous line): increasing, decreasing, and constant. The figure also shows the probabilities of the stationary option (dashed line).

Fig. 2

Probability of receiving the high (500 points) outcome for each gamble pair in the constant, increasing, and decreasing experimental conditions. The stationary option (represented by the red dashed line) obtained 500 points with 0.5 probability and 0 otherwise throughout the 100 trials. The probabilities and outcomes of the non-stationary option (represented by the blue continuous line) varied by experimental condition and trial (t). For the increasing condition: 500 points with 0.01*t probability, 0 otherwise. For the decreasing condition: 500 points with 1 - 0.01*(t - 1) probability, 0 otherwise. For the constant condition: 500 points with 0.5 probability, 0 otherwise, equaling the stationary option

In the increasing condition, the probability of receiving 500 points in the non-stationary option started at .01, increased by .01 on each subsequent trial, and ended at 1 on the final trial, while 0 was obtained with the complementary probability. In the decreasing probability condition, the probability of receiving 500 points in the non-stationary option started at 1, decreased by .01 on each subsequent trial, and ended at .01 on the final trial, while 0 was obtained with the complementary probability. In the constant probability condition the probability of receiving 500 points in the non-stationary option was 0.5 for every trial, and 0 was obtained with 0.5 probability, matching exactly the stationary option in all the conditions. Both choice options in all conditions had an equivalent expected value over all 100 trials (EVstationary = EVnon-stationary = (.50 × 0) + (.50 × 500) = 250 points).
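To make the payoff structure concrete, the following sketch (our own illustration, not the original task code) expresses the per-trial probability of the high outcome and the sampling of feedback under the schedules described above, assuming trials are numbered t = 1 to 100:

```python
import random

N_TRIALS = 100
HIGH, LOW = 500, 0  # point outcomes used in Experiment 1


def p_high_nonstationary(t, trend):
    """Probability that the non-stationary option pays 500 points on trial t (1..100)."""
    if trend == "increasing":      # starts at .01 and ends at 1.00
        return 0.01 * t
    if trend == "decreasing":      # starts at 1.00 and ends at .01
        return 1.0 - 0.01 * (t - 1)
    return 0.5                     # constant condition, matching the stationary option


def draw_outcome(p_high):
    """Sample one realized outcome for an option that pays 500 with probability p_high."""
    return HIGH if random.random() < p_high else LOW


# Example: trial 30 of the increasing condition
p = p_high_nonstationary(30, "increasing")  # 0.30
chosen = draw_outcome(p)      # outcome of the non-stationary option, if selected
forgone = draw_outcome(0.5)   # stationary option; shown to participants only under full feedback
```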

Although the stationary and non-stationary options in all conditions had equal overall expected values, for each trial, the relative benefit of the non-stationary over the stationary option depended upon the condition of the probability trend. In the increasing condition the non-stationary option had lower expected value than the stationary option during trials 1–49; it had equal expected value as the stationary option at trial 50; and it had higher expected value than the stationary option during trials 51–100. The decreasing condition had the reverse pattern. We define trial 50 as the switch point at which the stationary and non-stationary options switched their relative expected value. In the constant condition neither option had a higher expected value than the other on any trial.

In this task, adaptation is operationalized as the proportion of expected-value-maximizing choices in all trials after the expected value switch point. Additionally, per-trial analyses of choices after the switch point will allow us to also consider the speed of adaptation in our results.
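As a sketch of how this measure can be computed (assuming, per participant, a list of 100 indicators coded 1 when the option with the higher expected value on that trial was chosen), the before- and after-switch Max-rates are simply the means of the first and second halves of that list; how the equal-expected-value trial 50 is assigned is a minor detail we gloss over here:

```python
SWITCH_POINT = 50  # trial at which the options' relative expected values flip


def max_rates(maximizing_choices):
    """maximizing_choices: 100 flags (1 = chose the higher-expected-value option on that trial)."""
    before = maximizing_choices[:SWITCH_POINT]   # trials 1-50
    after = maximizing_choices[SWITCH_POINT:]    # trials 51-100
    return sum(before) / len(before), sum(after) / len(after)


# A participant who always chose the initially better option would score
# max_rate_before = 1.0 and max_rate_after = 0.0 (no adaptation).
```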

Procedure

Participants first provided informed consent and answered demographic questions. They received instructions about the task according to their feedback condition (see Appendix A), and then made 100 choices between the left and right buttons randomly assigned to the stationary option (labeled “A”) and the non-stationary option (labeled “B”). The left-right position of the non-stationary option was counterbalanced. After completing the task, participants filled out a debriefing questionnaire to ascertain their awareness of changes in the underlying probabilities: 1) “Overall, which option do you think was the best (gave you the most points on average)?” with answer choices of “A,” “B,” and “Neither”; 2) “Within each block of rounds below, what do you think the relationship was between the options?”. For five prompts that split the 100 trials into blocks of 20 trials (e.g., “Rounds 1–20,” “Rounds 21–40,” etc.), they selected answers from “A and B gave me the same number of points, on average,” “A gave me more points than B, on average,” or “B gave me more points than A, on average.”

The datasets for Experiments 1 and 2 are both available in the Open Science Framework repository. The experiments were not pre-registered.

Results

Proportion of maximizing choices before and after the switch point

We investigated adaptation through the proportion of choices of the option with higher expected value (Max-rate) after the switch point, independent of whether an option was stationary or non-stationary. Figure 3 shows the overall and per-trial Max-rate before and after the switch point for the four experimental conditions. In the constant condition, neither choice option is better, and we therefore removed this condition from these analyses. Descriptively, these proportions suggest that participants are capable of selecting the better option before the switch point (i.e., Max-rate in all conditions is greater than 0.50) but have trouble selecting the better option after the switch point (i.e., Max-rates are at 0.51 and below on average). It is also possible to observe that, before the switch point, the Max-rate is on average higher with full feedback than with partial feedback in both the increasing and decreasing trends. After the switch point, full feedback does not seem to improve the Max-rate on average, which appears to be affected more by trend; however, the speed of adaptation after the switch point may be affected by full feedback.

Fig. 3

Overall maximizing choices before and after expected value (EV) change in options in Experiment 1, by condition. Mean (SD) max-rate provided in the labels. Black error bars represent standard error

To assess these descriptive observations, we fit a generalized logit model predicting the Max-rate by trend, feedback, trial, and before/after block, and by their two-way, three-way, and four-way interactions. This model was a significant improvement over the null (χ2(15) = 3890.1, p < .001). The full set of regression coefficients (providing effect sizes) is reported in Appendix B, Table 4. The analyses revealed a significant main effect of feedback, χ2(1) = 67.58, p < 0.001, where the Max-rate was higher with full feedback (Mfull = 0.121 log odds) than with partial feedback (Mpartial = 0.033 log odds). There was also a significant main effect of trend, χ2(1) = 23.24, p < 0.001, where the Max-rate was higher for the increasing condition (Mincreasing = 0.118 log odds) than the decreasing condition (Mdecreasing = 0.035 log odds). The main effect of block was significant, χ2(1) = 2043.08, p < 0.001, where the Max-rate was higher before the switch point (Mbefore = 1.010 log odds) than after the switch point (Mafter = -0.860 log odds). The main effect of trial was also significant, χ2(1) = 570.48, p < 0.001.
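For readers who wish to see the shape of such an analysis, the sketch below fits a trial-level logistic regression with the same predictors and interactions using a formula interface. It is an illustration only: the column names and file name are hypothetical, and the authors' exact specification (described in their footnote and appendix) may differ, for example by including random effects for participants.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Assumed long-format data: one row per participant per trial, with columns
#   maximized (0/1), trend ("increasing"/"decreasing"), feedback ("full"/"partial"),
#   block ("before"/"after"), and trial (1..100).
df = pd.read_csv("experiment1_trials.csv")  # hypothetical file name

full_model = smf.glm(
    "maximized ~ trend * feedback * block * trial",  # all main effects and 2-, 3-, 4-way interactions
    data=df,
    family=sm.families.Binomial(),
).fit()

null_model = smf.glm("maximized ~ 1", data=df, family=sm.families.Binomial()).fit()

# Likelihood-ratio test of the full model against the null (cf. the chi-square reported above)
lr_statistic = 2 * (full_model.llf - null_model.llf)
print(full_model.summary())
print("LR chi-square vs. null:", lr_statistic)
```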

The two-way interaction of trend and block was significant, χ2(1) = 73.82, p < 0.001, indicating an effect of trend on adaptation. Before the switch point, the Max-rate was higher in the increasing condition (Mbefore-inc = 1.251 log odds) than in the decreasing condition (Mbefore-dec = 0.777 log odds). After the switch point, the Max-rate was instead higher in the decreasing condition (Mafter-dec = -0.706 log odds) than the increasing condition (Mafter-inc = -1.015 log odds). Additionally, the two-way interaction of feedback and block was significant, χ2(1) = 117.32, p < 0.001. Before the switch point, the Max-rate was higher with full feedback (Mbefore-full = 1.286 log odds) than with partial feedback (Mbefore-partial = 0.741 log odds), while after the switch point the Max-rate was higher with partial feedback (Mafter-partial = -0.676 log odds) than with full feedback (Mafter-full = -1.045 log odds). The interaction of trend and feedback was not significant (p > 0.05). The interaction of trend and trial was significant, χ2(1) = 19.31, p < 0.001. The interaction of feedback and trial was significant, χ2(1) = 17.10, p < 0.001. The interaction of block and trial was significant, χ2(1) = 302.51, p < 0.001. Looking at these estimates for the effects of trial, we can see that the speed of adaptation after the switch interacts with both feedback and trend: participants in the increasing condition adapt equally quickly with partial and full feedback, participants in the decreasing, partial feedback condition adapt more slowly than those in the increasing conditions, and participants in the decreasing, full feedback condition adapt more quickly than those in the increasing conditions.

The three-way interaction of trend, feedback, and block was also significant, χ2(1) = 16.08, p < 0.001. The interaction of trend, feedback, and trial was significant, χ2(1) = 38.93, p < 0.001. The interaction of trend, block, and trial was significant, χ2(1) = 15.45, p < 0.001. The interaction of feedback, block, and trial was significant, χ2(1) = 10.27, p < 0.001. The four-way interaction between trend, feedback, block, and trial was not significant (p > 0.05).

Overall, these results indicate that adaptation to change is affected by trend and feedback. On average, full feedback helps participants choose the better option before the switch point; however, it does not help after the switch point. On average, after the switch point, we observe more maximization in the decreasing condition than in the increasing condition. The speed of adaptation, however, does appear to be increased by the presence of full feedback, and it also interacts with trend, such that the decreasing, partial feedback condition shows the slowest adaptation and the decreasing, full feedback condition shows the fastest.

Individual-level analyses of adaptation to change

To assess successful adaptation to change at the individual level, we examined the average Max-rate before and after the switch point per participant in each of the four changing conditions. Each dot in Fig. 4 represents one participant’s average Max-rate in trials before and after the switch point. The dashed lines separate the participants into four quadrants. Along the X-axis, participants were split into groups based on their Max-rate before the switch point (either a Max-rate ≤ 0.5 or a Max-rate > 0.5). Along the Y-axis, participants were split into groups based on their Max-rate after the switch point (either a Max-rate ≤ 0.5 or a Max-rate > 0.5).

Fig. 4

Max-rate averages for individual participants before and after the switch point. The dashed lines separate the participants into four quadrants that denote proportion of maximizing choices before and after the switch point (discussed in more detail in the text)

For each observation, we used the quadrant to classify participants into four types of choice behavior. “Fortunate” respondents were participants who most often selected the option with lower expected value before the switch point and did not change after the switch point (luckily allowing them to maximize after the switch point). “Agile” respondents selected the best option both before and after the switch point, requiring a shift in preferred options to adapt to the change at the switch point. “Clumsy” respondents selected the lower expected value option before the switch point and then shifted to continue selecting the lower expected value option after the switch point; and “Rigid” respondents selected the maximizing option before the switch point and continued to prefer this same option after the switch point, unfortunately selecting the poor option after the change. In other words, Agile and Clumsy participants explicitly adapted to change, and Fortunate and Rigid participants did not.
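A minimal sketch of this classification rule, building on the per-participant before/after Max-rates computed earlier (the 0.5 cut-offs correspond to the quadrant boundaries in Fig. 4):

```python
def classify(max_rate_before, max_rate_after, threshold=0.5):
    """Classify a participant by whether they mostly maximized before/after the switch point."""
    maximized_before = max_rate_before > threshold
    maximized_after = max_rate_after > threshold
    if maximized_before and maximized_after:
        return "Agile"      # adapted: switched toward the newly better option
    if maximized_before and not maximized_after:
        return "Rigid"      # stuck with the initially better option after it became worse
    if not maximized_before and maximized_after:
        return "Fortunate"  # stayed on the initially worse option as it became the better one
    return "Clumsy"         # switched options yet still ended up on the lower-expected-value option
```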

Figure 4 provides exact percentages for each experimental condition, while Table 1 shows the percentage of participants of each type averaged over trend levels, averaged over feedback levels, and the differences between the feedback and trend levels, respectively. A Pearson’s χ2 test of independence between experimental condition and the proportion of participants in each of the four maximization behavior categories was non-significant (χ2(9) = 11.68, p = 0.23), indicating that the distribution of participants among these categories did not differ significantly between the four experimental conditions. Tables 6 and 7 in Appendix B provide the observed counts, expected counts, and standardized residuals for this analysis. Appendix B, Fig. 13 simulates how random responses would appear in this type of analysis.
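As an illustration of this style of analysis (our own sketch, not the authors' code, with made-up counts), a condition-by-category contingency table can be tested for independence and then inspected with standardized (adjusted) Pearson residuals, where absolute values beyond roughly 2 flag cells that deviate notably from the counts expected under independence:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = experimental conditions, columns = behavior categories
# (Fortunate, Agile, Clumsy, Rigid)
observed = np.array([
    [12, 35,  4, 49],
    [ 9, 41,  3, 47],
])

chi2, p, dof, expected = chi2_contingency(observed)

# Standardized (adjusted) Pearson residuals:
# (observed - expected) / sqrt(expected * (1 - row proportion) * (1 - column proportion))
n = observed.sum()
row_prop = observed.sum(axis=1, keepdims=True) / n
col_prop = observed.sum(axis=0, keepdims=True) / n
std_residuals = (observed - expected) / np.sqrt(expected * (1 - row_prop) * (1 - col_prop))

print(chi2, p, dof)
print(std_residuals)  # |residual| > ~2 marks a cell deviating from independence
```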

Table 1 Percent of participants displaying each pattern for Individual Max-rate, Experiment 1

Given the lack of evidence for an effect of experimental condition on behavior category, we consider the general patterns of behavior across experimental conditions. Descriptively, there are more non-adaptive than adaptive individuals in all conditions. Most of the adaptive participants are Agile (38%), not Clumsy (3.5%), and most of the non-adaptive participants are Rigid (48%), not Fortunate (10.5%). These results suggest a general difficulty in adapting to change, with individual differences playing a role in whether decision makers adapted. Especially consistent with stickiness, we see considerable individual rigidity, where participants failed to switch to the alternative option after the switch point, rather than individual fortune in initially selecting the worse option and continuing to select it as it became the better option.

Beliefs of maximization

Here we analyzed the way participants answered questions indicating their beliefs regarding the relative value of the two options (stationary and non-stationary). Figure 5 shows the distributions of beliefs about overall value for the different conditions. As observed, a minority of participants (fewer than 50% in all conditions) held the correct belief that the two options were of equal value overall (stationary = non-stationary), even in the constant condition. Thus, a majority of participants thought (incorrectly) that the two options differed in overall value. As observed in Fig. 5, there was a general tendency to believe that the initially better option had the greater overall value (i.e., percent of stationary > non-stationary in the increasing condition and non-stationary > stationary in the decreasing condition), suggesting a primacy effect.

Fig. 5

Beliefs about the value of the stationary (S) and non-stationary (NS) options over all trials in Experiment 1, by condition. After completing all choice trials, participants in each condition indicated whether they believed that the (unlabeled) stationary option would provide more points overall (S > NS), that the non-stationary option would provide more points overall (NS > S), or that both options were equal on average (S = NS). Appendix B, Fig. 14 provides participant beliefs given their actual, experienced expected value

A Pearson’s χ2 test of independence between experimental condition and overall belief suggests that the distribution of overall belief varies significantly by experimental condition (χ2(10) = 116.68, p < 0.001). Follow-up analyses use standardized Pearson residuals to identify significant deviations from the expected cell counts: the counts expected if the distribution of beliefs were independent of experimental condition (Agresti, 2002). These analyses suggest significant cell count deviations for participants who (incorrectly) believed the options were unequal. More often than expected, these participants believed that the initially better option had the greater overall value. Beliefs that the S option was better than the NS option occurred less frequently than expected in both the partial and full feedback, decreasing conditions, and more frequently than expected in both the partial and full feedback, increasing conditions. Beliefs that the NS option was better than the S option occurred more frequently than expected in both the partial and full feedback, decreasing conditions and less frequently than expected in both the partial and full feedback, increasing conditions. In both the partial and full feedback, constant conditions, beliefs that one option was better than the other did not occur significantly more or less frequently than expected if beliefs and condition were independent. Tables 10 and 11 in Appendix B provide the observed counts, expected counts, and standardized residuals for this analysis. Given the probabilistic nature of the task, an additional analysis is provided in Fig. 14 in the Appendix. This analysis takes into account the outcomes that participants actually experienced, and it suggests that the beliefs of participants in the partial feedback conditions may somewhat successfully track their experienced outcomes, but that participants in the full feedback conditions still too frequently believe that the initially better option was the better option overall.

Experiment 1: Discussion

The results of Experiment 1 reveal the difficulty of adapting to the continuous change of probabilities in a simple, binary choice task. We found both a main effect of trend and a trend-by-block interaction: on average, participants in the decreasing condition adapted more successfully to the change (after the switch point) than participants in the increasing condition, with further differences in the speed of adaptation influenced by trend and feedback. Also, on average, participants receiving full feedback made more maximizing choices before the switch point than participants receiving partial feedback, but feedback (on average) did not help after the switch point, although it did increase the speed of adaptation in the decreasing condition after the switch. These results provide evidence for an asymmetry in adaptation based on the trend (direction) of change and on feedback.

Analysis of individual strategies and the post-survey beliefs about the expected values of the two options suggest “rigidity” as the most common problem with adaptation, where participants continue to choose the same option after the switch point, and where their beliefs suggest that the initially better option was considered the best option overall. Despite full feedback, it is possible that the change was very hard to detect because probability is not explicitly observed, but rather participants must infer a change in probability from the frequency of experiencing the high outcome. It is also possible that participants expected the initially better option to be the better option overall, and thus paid little attention to the frequency of experiencing the high outcome.

The question we address in Experiment 2 is whether making the object of change directly observable would improve the ability of participants to adapt to the environmental changes. In Experiment 2, the outcome value itself is the changing feature in the decision problem. We expect to see overall better adaptation to change when the object of change is made directly observable to the participants.

Experiment 2: Detection of outcome change

Method

Participants

We tested 603 participants (Mage = 34.2, SDage = 10.6 years; 261 female), recruited from Amazon Mechanical Turk. Participants were compensated solely based on their performance in the choice task (1 cent for every 100 points accumulated in the task), earning $2.59 on average (minimum = $1.76 and maximum = $4.62). No participants were excluded from analysis. The sample size was chosen based on the available budget.

Task and design

The design of this experiment was exactly the same as that of Experiment 1, except that the element of change was the outcome (the observable element of the gamble) instead of the probability (the unobservable element of the gamble). Participants were distributed about equally across conditions: increasing, full feedback N = 98; increasing, partial feedback N = 101; decreasing, full feedback N = 101; decreasing, partial feedback N = 102; constant, full feedback N = 99; and constant, partial feedback N = 102.

To keep the gambles of this experiment equivalent to those in Experiment 1, both options were risky prospects with a low-value outcome of 0 points and a high-value outcome of up to 1,000 points; one option was stationary (the outcome values did not change over the course of 100 trials) and the other option was non-stationary (the outcome values increased, decreased, or stayed constant, according to the experimental condition). For all conditions, the stationary option was the same as in Experiment 1, offering a high outcome (500 points) and a low outcome (0 points), each with 0.5 probability. In the increasing condition, the non-stationary option’s high-value outcome increased from 10 to 1,000, with ten additional points on each subsequent trial. In the decreasing condition, the non-stationary option’s high-value outcome decreased from 1,000 to 10, with ten fewer points on each subsequent trial. In the constant condition, the non-stationary option’s high-value outcome stayed constant at 500 over the course of 100 trials. Figure 6 illustrates examples of partial and full feedback with changing outcome values, and Fig. 7 illustrates these values in each of the three trend conditions.

Fig. 6

The repeated consequential choice paradigm of decisions from experience as implemented in Experiment 2. The example illustrates a case of increasing values of the high-value outcome in the non-stationary option (Option B). The left panel illustrates partial feedback and the right panel illustrates full feedback

Fig. 7

Value of the high outcome for each choice option in the constant, increasing, and decreasing experimental conditions. The stationary option (represented by the red dashed line) obtained 500 points with 0.5 probability and 0 otherwise throughout the 100 trials. The high-value outcome of the non-stationary option (represented by the blue continuous line) varied by experimental condition and trial (t). For the increasing condition: 10*t points with 0.5 probability, 0 otherwise. For the decreasing condition: 1,000 - 10*(t - 1) points with 0.5 probability, 0 otherwise. For the constant condition: 500 points with 0.5 probability, 0 otherwise, equaling the stationary option

Thus, Experiment 2 was objectively equivalent to Experiment 1 in terms of expected values; the relative benefits of the non-stationary over the stationary option were also equivalent to those in Experiment 1. In the increasing condition, the non-stationary option was worse in expected value than the stationary option before the switch point (trial 50), and better than the stationary option after the switch point. In the decreasing condition the non-stationary option was better than the stationary option before the switch point and worse than the stationary option after the switch point. The relative benefit of the two options was the same before and after the switch point in the constant condition.
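To make this equivalence concrete: on trial t of the increasing conditions, the non-stationary option’s expected value is 0.01*t × 500 = 5*t points in Experiment 1 and 0.5 × 10*t = 5*t points in Experiment 2, and the decreasing conditions mirror this. The quick numerical check below assumes the trial indexing used in our earlier sketch (an illustration of the described schedules, not the original task code):

```python
import math


def ev_exp1(t, trend):
    """Per-trial expected value of the non-stationary option in Experiment 1 (changing probability)."""
    p_high = 0.01 * t if trend == "increasing" else 1.0 - 0.01 * (t - 1)
    return p_high * 500


def ev_exp2(t, trend):
    """Per-trial expected value of the non-stationary option in Experiment 2 (changing outcome)."""
    high_outcome = 10 * t if trend == "increasing" else 1000 - 10 * (t - 1)
    return 0.5 * high_outcome


# The two experiments yield the same expected value on every trial and in both trends.
assert all(
    math.isclose(ev_exp1(t, trend), ev_exp2(t, trend))
    for t in range(1, 101)
    for trend in ("increasing", "decreasing")
)
```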

The procedures were identical to Experiment 1.

Results

Given the similarities between experiments, we conducted the same analyses as in Experiment 1. In what follows, we summarize the results for the proportion of maximizing choices, the individual-level analyses of maximization behavior, and the analyses of beliefs. Supplemental results from the statistical analyses can be found in Appendix B.

As in Experiment 1, we analyzed the proportion of choices of the best option (Max-rate) in the increasing and decreasing conditions, both per trial and before and after the switch point (see Fig. 8). We observed multiple patterns similar to Experiment 1. Participants are generally capable of selecting the best option before the switch point (i.e., Max-rate in all conditions is greater than 0.50). However, in most conditions (except for the decreasing, partial feedback condition) they have trouble adapting their choices after the switch point, even when the outcomes were directly observed and even with full feedback.

Fig. 8

Overall maximizing choices before and after the switch point in Experiment 2, by condition. Mean (SD) max-rate provided in the labels. Error bars represent 95% confidence intervals

We fit a generalized logit model predicting the Max-rate by trend, feedback, trial, before/after block, and their interactions. This model was a significant improvement over the null (χ2(15) = 2216.9, p < .001). The full set of regression coefficients (providing effect sizes) is provided in Appendix B, Table 5. The analyses revealed a significant main effect of feedback, χ2(1) = 357.25, p < 0.001, where the Max-rate was higher with full feedback (Mfull = 0.434 log odds) than with partial feedback (Mpartial = 0.134 log odds). There was also a significant main effect of trend, χ2(1) = 11.11, p < 0.001, where the Max-rate was higher for increasing (Mincreasing = 0.317 log odds) than for decreasing (Mdecreasing = 0.250 log odds). The main effect of block was significant, χ2(1) = 990.28, p < 0.001, where the Max-rate was higher before the switch point (Mbefore = 0.948 log odds) than after the switch point (Mafter = -0.380 log odds). The main effect of trial was also significant, χ2(1) = 548.70, p < 0.001.

The two-way interaction of trend and block was significant, χ2(1) = 51.88, p < 0.001, indicating an effect of trend on adaptation. Before the switch point, the Max-rate was higher in the increasing condition (Mbefore-inc = 1.130 log odds) than in the decreasing condition (Mbefore-dec = 0.766 log odds). After the switch point, the Max-rate was instead higher in the decreasing condition (Mafter-dec = -0.266 log odds) than the increasing condition (Mafter-inc = -0.495 log odds). Additionally, the two-way interaction of feedback and block was significant, χ2(1) = 54.98, p < 0.001. Before the switch point, the Max-rate was higher with full feedback (Mbefore-full = 1.243 log odds) than with partial feedback (Mbefore-partial = 0.652 log odds), while after the switch point Max-rates were comparable (Mafter-full = -0.376 log odds, Mafter-partial = -0.385 log odds). The interaction of trend and feedback was also significant, χ2(1) = 101.91, p < 0.001: the advantage of full over partial feedback was larger in the increasing condition (Mfull-inc = 0.478 log odds, Mpartial-inc = 0.157 log odds) than in the decreasing condition (Mfull-dec = 0.389 log odds, Mpartial-dec = 0.110 log odds). The interaction of trend and trial was significant, χ2(1) = 5.99, p = 0.01. The interaction of feedback and trial was significant, χ2(1) = 28.02, p < 0.001. The interaction of block and trial was significant, χ2(1) = 73.27, p < 0.001. Looking at the estimates for the effects of trial, we can see that the speed of adaptation after the switch interacts with both feedback and trend. Participants with full feedback adapt more quickly than participants with partial feedback, and participants in the decreasing conditions adapt more quickly than those in the increasing conditions; however, this effect of trend is smaller in the full feedback condition than in the partial feedback condition, indicating that the helpful effects of feedback appear mainly in the speed of adaptation.

The three-way interaction of trend, feedback, and block was also significant, χ2(1) = 41.34, p < 0.001. The interaction of trend, feedback, and trial was not significant, p > 0.05. The interaction of trend, block, and trial was significant, χ2(1) = 11.65, p < 0.001. The interaction of feedback, block, and trial was significant, χ2(1) = 6.49, p = 0.01. The four-way interaction between trend, feedback, block, and trial was significant, χ2(1) = 22.09, p < 0.001.

These results suggest that, as in Experiment 1, many participants successfully maximized before the switch point, but on average most had difficulty adapting to the change, with the exception of participants in the decreasing, partial feedback condition. On average, participants in the decreasing conditions showed comparable (full feedback) or better (partial feedback) adaptation to the change than participants in the increasing conditions. Once again, the speed of adaptation interacts with both feedback and trend, with full feedback increasing adaptation speed and the decreasing trend yielding slightly faster adaptation than the increasing trend.

Individual-level analyses of adaptation to change

As in Experiment 1, we investigate adaptation to change at the individual level using the Max-rate before and after the switch point per participant in each of the four changing conditions. Again, the quadrants classify the participants into Fortunate, Agile, Clumsy, and Rigid choice patterns, where Agile and Clumsy participants adapted to change by explicitly switching their choices to the most rewarding option after the switch point, while Fortunate and Rigid participants did not. Figure 9 shows the individual Max-rates by condition and indicates the observations for each quadrant (maximization behavior pattern). Table 2 shows the percent of participants of each type by experimental condition.

Fig. 9

Max-rate averages for individual participants before and after the switch point. The dashed lines separate the participants into four quadrants that denote proportion of maximizing choices before and after the switch point (discussed in more detail in the text)

Table 2 Percent of participants displaying each choice pattern for Individual Max-rate, Experiment 2

A Pearson’s χ2 test of independence between experimental condition and category of maximization behavior suggests that the distribution of participants among the four categories of maximization behavior varies significantly with the experimental condition (χ2(9) = 52.38, p < 0.001). The residuals indicate that there were fewer Agile participants and more Rigid participants than expected in the increasing, partial feedback condition, and also fewer Fortunate participants than expected in the increasing, full feedback condition, consistent with aggregate-level observations that the increasing, partial feedback condition is associated with more adaptation difficulty. Also consistent with aggregate-level observations of better adaptation in the decreasing condition, the residuals indicate that there were more Fortunate participants and fewer Rigid participants than expected in the decreasing, partial feedback condition. Tables 8 and 9 in Appendix B provide the complete set of observed counts, expected counts, and standardized residuals. Thus, we see more adaptive participants in the decreasing (49%) than the increasing (43%) conditions, and more adaptive participants in the full feedback (57%) than the partial feedback (36%) conditions. The number of adaptive and non-adaptive individuals appears to depend on both the trend and the feedback. As in Experiment 1, descriptively over all conditions, most adaptive participants were Agile (40%), not Clumsy (7%); and most non-adaptive participants were Rigid (36%), not Fortunate (17%). These results suggest difficulty adapting to change, slightly more so in the increasing than in the decreasing condition, and greater difficulty with partial than with full feedback. The lack of adaptation is, again, likely due to individual rigidity, where participants failed to switch to the alternative option after the switch point, and not because they were fortunate (selecting the worse option initially and continuing to select it after the switch point as it became the better option).

Beliefs of maximization

Figure 10 shows the distributions of beliefs about the overall values of the two options for each of the different experimental conditions. As observed in Experiment 1, a minority of participants (fewer than 50% in all conditions) held the correct belief that the two options were of equal value overall, even in the constant condition. A Pearson’s χ2 test of independence between experimental condition and overall belief suggests that the distribution of overall belief varies significantly by experimental condition (χ2(10) = 41.67, p < 0.001). Follow-up analyses use standardized Pearson residuals to identify significant departures from the expected counts (expected if the distribution of beliefs was independent of experimental condition; Agresti, 2002). These analyses find beliefs that the stationary option was better than the non-stationary option more often than expected for the partial feedback, increasing condition, and a trend in this direction for the full feedback, increasing condition. In contrast, beliefs that the stationary option was better occurred less often than expected in the full feedback, decreasing condition, and trended in this direction for the partial feedback, constant condition. Beliefs that the non-stationary option was better than the stationary option occurred more frequently than expected in the full feedback, decreasing condition. Additionally, we see a departure from the patterns of Experiment 1 for the decreasing, partial feedback condition: the proportion of beliefs that the stationary option was better than the non-stationary option and the proportion of beliefs that the non-stationary option was better than the stationary option were not significantly different from those expected under the null hypothesis that there was no effect of condition on the distribution of beliefs. Descriptively, we see a hint of a recency effect, with slightly more frequent beliefs that the stationary option (the initially worse option) was better overall than beliefs that the non-stationary option was better overall (though the frequencies of both beliefs are similar). Finally, beliefs that the two options were equal in expected value trended toward less than expected in the partial feedback, increasing condition and toward more than expected in the full feedback, constant condition. Tables 12 and 13 in Appendix B provide the observed counts, expected counts, and standardized residuals for this analysis. Given the probabilistic nature of the task, an additional analysis is provided in Fig. 15 in the Appendix. This analysis takes into account the outcomes that participants actually experienced and, as in Experiment 1, suggests that participants in the partial feedback conditions may somewhat successfully track their experienced outcomes, but that participants in the full feedback conditions still too frequently believe that the initially better option was the better option overall.

Fig. 10

Beliefs about the value of the stationary (S) and non-stationary (NS) options over all trials in Experiment 2, by condition. After completing all choice trials, participants in each condition indicated whether they believed that the (unlabeled) stationary option would provide more points overall (S > NS), that the non-stationary option would provide more points overall (NS > S), or that both options were equal on average (S = NS). Appendix B, Fig. 15 provides an analysis of beliefs given the experienced expected value

Cross-experiment comparison

Table 3 summarizes the results of the Max-rate across both experiments. First, in both experiments there was a main effect of feedback, such that full feedback led to significantly more maximizing choices than partial feedback, and a significant main effect of block, where the Max-rate was higher before the switch point (Block 1) than after it (Block 2), clearly exposing the general difficulty of adapting to change. Second, in both experiments there was a consistent, significant interaction between trend and block, such that there was better adaptation on average (maximization over all trials after the switch point) in the decreasing than in the increasing trend.

Table 3 Summary of experimental condition effects on Max-rate for Experiments 1 and 2

The main differences in Max-rate between the experiments lie in several interactions. First, Experiment 2 found a significant interaction between feedback and trend, while Experiment 1 did not. In Experiment 2, the maximization rate was higher for full feedback than for partial feedback in both the increasing and decreasing conditions; however, the difference between full and partial feedback was greater in the increasing condition than in the decreasing condition. Second, Experiment 1 found a significant interaction between feedback, trend, and trial, while Experiment 2 did not. Third, Experiment 2 found a significant interaction between feedback, trend, block, and trial, while Experiment 1 did not. Additionally, Experiments 1 and 2 both found a significant interaction between feedback and block. However, in Experiment 1 this interaction was a crossover: before the switch point, the Max-rate was higher with full feedback than with partial feedback, while after the switch point the Max-rate was higher with partial feedback than with full feedback. In contrast, in Experiment 2, before the switch point the Max-rate was higher with full feedback than with partial feedback, while after the switch point Max-rates were comparable.

General discussion

We found a robust effect of trend on adaptation to change. Participants selected the maximizing option more often after the switch point when the non-stationary option decreased in value over time than when it increased in value, showing trend’s influence on adaptation. This trend effect in adaptation is consistent with the information asymmetry created by a stickiness effect: participants more often selected the choice option providing higher rewards early on, suggesting the increasing and decreasing conditions differed in whether information about the change (the non-stationary option) was available or salient. We offer two factors that moderate the trend effect: (a) the observability of the forgone payoffs (full vs. partial feedback) and (b) the observability of the element of change (i.e., the outcomes vs. the unobservable probabilities of the outcomes).

First, a trend effect for adaptation with partial feedback may be explained by the same mechanisms driving the “hot stove effect” (Denrell & March, 2001; Rich & Gureckis, 2019) and choice patterns seen in previous dynamic decision-making studies (Cheyette et al., 2016; Konstantinidis, Harman, & Gonzalez, 2022). Specifically, in partial feedback conditions, participants only experience the outcomes of the choice options they select. Therefore, participants may acquire more information about an option that is initially favorable than about an option that is initially unfavorable. In the case of the increasing condition – where the non-stationary option is initially unfavorable – many participants selected the favorable, stationary option more often, limiting their ability to learn about the improving value of the non-stationary option.

Similar factors may occur even with full feedback. We find that full feedback is essential for selecting the best option in early trials, but it is not enough for successful adaptation. That is, maximization frequency was higher with full feedback than with partial feedback but only before the switch in the relative expected value of the options. After the switch, full feedback only influenced speed of adaptation. This suggests that full feedback must be considered carefully as a method to help individuals successfully adapt to change. Depending on the number of decisions (or amount of time) after a change, a decision maker with full feedback may not be quick enough to counter the “disadvantage” provided by entering a changed environment with a strong preference for a previously better option, even though we would expect more information to improve a participant’s ability to detect and adapt to change.

These results could be explained by individual differences in memory weight or attention given to the payoff of the selected option versus the forgone payoff, as suggested in previous work (Lejarraga et al., 2014; Rakow & Miler, 2009). If one weighs the outcomes of the selected option more than the outcomes of the option not selected, this could also lead to stickiness in the selection of the initially more favorable option and to an inability to adapt when these outcomes change, even when feedback about both options is provided. Regarding attention, Ashby and Rakow (2016), using eye tracking, found that participants may fail to attend to forgone outcomes more than to obtained outcomes, especially after a long sequence of trials. Thus, a full feedback condition may have the same effects as the partial feedback condition as the task progresses, explaining why people might fail to identify changes that occur after the switch.

Second, our results also suggest that stickiness effects depend on the direct observability of the element of change. More participants were classified as “Agile” (respondents who, on average, selected the best option both before and after the switch point) in Experiment 2, when the outcome rather than the probability changed, and with full feedback (from 21.8% to 50% in the increasing condition, and from 38.2% to 48.5% in the decreasing condition). In contrast, in Experiment 1 full feedback increased the proportion of individuals classified as “Rigid” (respondents who, on average, selected the maximizing option before the switch point and continued to prefer this same option after the switch point; from 49.5% to 57% in the increasing condition and from 40.6% to 44.4% in the decreasing condition). Thus, the main conclusion is that full feedback facilitates adaptation to change only when the object of change is fully observable.

In summary, we found evidence that decision makers are more successful at adapting to the decreasing expected value of an option than to its increasing expected value, with or without counterfactual feedback about outcomes, and whether the element of change is indirectly or directly observable. These findings are consistent with explanations of over-reliance (through choice selection, memory, attention, or weighting processes) on initial experiences versus later (recent) experiences. They also suggest the importance of memory-based computational theories of dynamic decision making that can help clarify how individual decision makers treat early and later experiences when forming an understanding of dynamic environments, such as whether early versus later experiences are more salient or are given more weight or attention in decision making (Konstantinidis et al., 2022). Such mechanisms may interact with different patterns of change to result in better or worse adaptation. While our results provide an initial step toward elucidating the dynamics of the memory balance between stickiness and recency effects, there is substantial need to advance our understanding of these dynamic effects on individuals’ (and groups’ and organizations’) abilities to successfully adapt to changing conditions in the environment. Furthermore, our research studied a pattern of change that was immediate, linear, and continuous; to assess the generality and robustness of our findings, it is essential to systematically investigate the effects of other dynamic patterns of change that would give rise to the general effect of over-reliance on initial or more recent experiences.