## Abstract

Making successful decisions in dynamic environments requires that we adapt our actions to the changing environmental conditions. Past research has found that people are slow to adapt their choices when faced with change: they tend to be over-reliant on initial experiences, and they are susceptible to factors such as feedback and the direction of change (trend). We build on these findings using two experiments that manipulate feedback and trend in a binary choice task, where decisions are made from experience. Feedback was either partial (providing only the outcome of the selected choice) or full (providing outcomes of the selected and the forgone choice), and the expected value of one option either increased, decreased, or remained constant. Crucially, although the two choice options had equal expected value averaged across all trials, their expected values on individual trials differed, and halfway through 100 choice trials the choice option with higher expected value switched, requiring participants to adapt their choices in order to maximize their outcomes. In Experiment 1, the *probability* of receiving the high-value outcome changed over time. In Experiment 2, the *outcome value* changed over time. Generally, we found that participants had trouble adapting to change: full feedback led to more maximization than partial feedback before the switch but did not make a difference after the switch, suggesting stickiness and poor adaptation. Adaptation was slightly better for changing outcome values than for changing probabilities, implying that the observability of the element of change influences adaptation.

## Introduction

Our ability to detect changing situations and our capacity to adapt our decisions accordingly may determine the success of the choices we make. Choice options that were initially rewarding might become less rewarding in the future, or vice versa; and safer alternatives might become riskier over time, while riskier options might become safer. For example, stock market traders must successfully detect changes in the market to stay on top of the best prices. Traders must track and adapt their investment decisions according to the available assets, their expected probability of success, and the stock values. All of these aspects may result in the increase or decrease of stock values over time, making some investments potentially more profitable than others.

However, it is difficult to detect changing situations because often we do not observe the information that would signal a change in the environment and we often rely on our own experience, i.e., the outcomes of similar decisions we have made in the past (Behrens, Woolrich, Walton, & Rushworth, 2007; Gonzalez et al., 2003). To adapt choices to changing environments, individuals need to be able to track the statistical structure of the environment and be able to adapt their learning accordingly (Behrens et al., 2007). For example, the stock market investor must track the changing signals in the market in order to determine the most profitable investment portfolios. With reliance on experience, there are at least two aspects that will influence the way people learn and adapt their choices to changes in the environment: the feedback from the decisions made will signal whether a decision resulted or could have resulted in a good outcome; and the direction (i.e., trend) in which a change occurs will determine whether the environment is improving or worsening. For example, the past trend of a market, the outcome of past investment decisions, and whether a stock’s value is expected to increase or decrease over time will influence the trader’s choices.

Past work in behavioral decision research (discussed below) suggests that humans adapt their choices slowly to volatile environments. Researchers have used controlled choice environments, usually repeated binary choices, with full feedback (where the outcomes of both the chosen and unchosen option are revealed) or partial feedback (where only the outcome of the chosen option is revealed); and they have also included diverse patterns of change in the value of the choice options over time. Current results point to “stickiness” (i.e., over-reliance on initial experiences) as an inhibitor of adaptation, which can lead to low awareness of change depending on the kind of feedback and the trend patterns. For example, a trader may adapt more successfully to market changes when an investment was initially better but worsens over time than to changes in an initially worse but improving investment, because the trader was not investing in the worse option in the first place. This *trend effect* may be influenced by the feedback and the outcomes that are actually observed as decisions are made in a volatile environment.

In this paper, we advance current research on choice adaptation by clarifying the trend effects according to different types of feedback and information that signals a change in the environment. Specifically, we look at the performance and individual strategies before and after the options change while learning in a binary choice task. We determine how adaptation strategies under different trends may be influenced by the type of feedback (full or partial) and by a direct observation of the information that changes (probabilities or outcomes). Our results from two experiments help to replicate and advance the effects of the trend, feedback, and observability of outcomes in dynamic environments, improving our knowledge of the conditions that influence adaptation of choice.

### Trend effects and adaptation to change

The trend of change in the environment (whether a decision option was initially better but worsens over time or whether it is initially worse but improves over time) has been studied in binary choice tasks where one of the options is stationary and another one is non-stationary (Rakow & Miler, 2009). In their experimental paradigm, the probability of receiving a high outcome in a non-stationary option changed over multiple trials using various trends. Their results suggest that participants adapt slowly to changes in the environment mostly due to *stickiness* in which initial experiences impact later choices and may inhibit adaptation to changes in the probabilities. Participants overall preferred the option that was initially higher in expected value, inhibiting their adaptation in later choices. They used multiple games of repeated binary choice, where the non-stationary choice’s probability varied in its pattern of change over time. For some patterns, the probability changed from stable to increasing to stable again, and for other patterns, the probability changed from stable to decreasing to stable again. Thus, for increasing probability patterns, the non-stationary option was initially lower in expected value than the stationary option but then became higher. For decreasing probability patterns, the non-stationary option was initially higher in expected value than the stationary option but became lower. Rakow and Miler’s findings suggested an effect of trend on adaptation. Specifically, their first experiment revealed fewer switches away from a decreasing probability option than toward an increasing one. However, Rakow and Miler’s experimental design did not manipulate the trend alone, and their observation of stickiness did not replicate in their own experiments. 
Thus, the authors carefully proposed (but did not conclude) that decision makers better adapted to an improving (initially unfavorable option that improves over time) than to a worsening (initially favorable option that worsens over time) trend of change. The authors left the study of the trend and its cognitive explanation open for future research.

Recent studies have corroborated and extended the initial observations from Rakow and Miler. Rakow and Miler’s discussion of *recency* in non-stationary environments has also received subsequent attention. Recency is an over-reliance on one’s most recent experiences, and studies have pointed concretely to recency as a key facilitator of choice adaptation under changing environmental conditions. For example, Lejarraga et al. (2014) suggested that recency may drive better adaptation of choices made by individuals (compared to groups). They found that individuals adapted to change more successfully than groups in gamble trends with decreasing probability of the best outcome, but not in trends with increasing probability. Their argument was that individuals, having “poorer” memory than groups, rely more on recent than earlier experiences, which leads them to adapt better to change than groups do. However, this effect was not robust across all increasing trends, leading Lejarraga et al. (2014) to leave the test of the trend effect for future research.

In an attempt to clarify the effect of trend on adaptation to change, and the role of stickiness and recency, researchers have directly manipulated the trend in experiments using a binary choice task similar to that of Rakow and Miler, where the probabilities of obtaining a high outcome in a non-stationary option change continuously over time (e.g., linearly increased or decreased; see Cheyette et al., 2016; Konstantinidis, Harman, & Gonzalez, 2022). Current results indicate slower choice adaptation to increasing probabilities than to decreasing probabilities, and an emerging preference for a sure option when the alternatives are stationary. Their results are also supported by parameters derived from cognitive models of experiential choice (Cheyette et al., 2016; Gonzalez et al., 2003; Konstantinidis et al., 2022). Konstantinidis and colleagues fit models to individual participant data, and observed that individual differences in recency-of-memory parameters were associated with successful adaptation. These findings suggest that individuals who were more forgetful of past experiences were better at adapting to changing environments, in contrast to the face-valid argument that more information about the history of events helps decision makers adapt better. We extend these recent investigations by determining how the trend effect may be influenced by the type of outcome feedback and by the observability of the element of change.

### Feedback effects and observability of outcomes

Feedback is an important aspect of learning and choice behavior in sequential decision tasks (Erev & Barron, 2005; Gonzalez et al., 2003). Specifically, environments with partial feedback force decision makers to choose between *exploring* the outcomes of choice options to obtain more useful information and *exploiting* choice options based on already known information (Camilleri & Newell, 2011; Yechiam & Rakow, 2012). In contrast, full feedback removes the need for choice-based exploration, allowing decision makers to focus on exploiting the available choice options, and also to better adapt to harsh environments (Rakow et al., 2015). These behavioral patterns are also supported by neuroscientific evidence: there are strong correlations between the knowledge of forgone outcomes (i.e., full feedback) and brain activity in areas involved with valuation and choice (Lohrenz et al., 2007). Thus, we would expect that choice adaptation would depend on the type of feedback provided.

Rakow and Miler (2009) used full feedback, and as discussed above, they found evidence for a potential effect of trend on adaptation, explained partially with the stickiness effect. In contrast, recent investigations used only partial feedback paradigms (Cheyette et al., 2016) and found an effect of the trend on adaptation explained by the recency effect.

Other related studies have directly compared the effects of partial and full feedback in volatile environments, but not the effects of the changing trend. Avrahami et al. (2016) found differences in choice behavior when full versus partial feedback was provided in a task where the expected values of two choice options flipped multiple times. They found that, when full feedback was provided, participants showed a greater preference for the riskier of the two choice options, and their choices were influenced by the outcome of the riskier choice. In comparison, when partial feedback was provided, participants’ choices were only influenced by outcomes when their current choice obtained a low outcome and the last outcome of the forgone option was a high outcome.

More generally, the effects of the type of feedback may be directly linked to the observability of the change itself. For example, recent research has found that participants who are more sensitive to probabilities and outcome values can better respond to changes in these features (Wulff, Mergenthaler-Canseco, & Hertwig, 2018). Also, Ashby and Rakow (2016) found that, with long sequences of choices, participants often fail to observe some of the outcomes. In particular, participants fail to observe the forgone outcomes more often than the obtained outcomes, and increasingly so as the task progresses. These results suggest that participants’ ability or propensity to observe the change will influence the way they identify changes in the task, particularly when changes occur later in the decision task.

In binary choice tasks, changes in underlying probabilities can only be inferred from the frequency with which the various outcomes occur. This sets an upper limit on how accurately participants can adapt to changes in probabilities, even if they pay full attention to the information provided. Research has found that the volatility of a reward environment influences the way individuals adjust their choices (Behrens et al., 2007), and that sensitivity to probability values itself varies: most participants making decisions from experience are least sensitive to differences among moderate probabilities (between roughly 0.4 and 0.6) rather than among high probabilities (Kellen, Pachur, & Hertwig, 2016).

In contrast to past studies on changes in probabilities, we offer a novel way to investigate the effect of the observability of change, by making the probabilities fixed and the outcomes variable. Given that the outcome values are observed directly, we expect that changes in outcomes will be quicker and easier to detect compared to changes in probability. Thus, the type of change faced (whether in probabilities or outcome values) is expected to influence adaptation to different trends of change.

### Summary

To summarize, we extend current research by determining how the trend effect of change adaptation may be influenced by the type of feedback and the observability of the element of change (i.e., probability or outcome).

In Experiment 1, we experimentally manipulate the trend (direction) of the change in probability using a continuous linear function, and the type of feedback provided in choices between two options: one with stationary probabilities and one with non-stationary probabilities. In Experiment 2, we manipulate the trend of changing outcome values instead of probabilities to test the effect of directly observing change. Experiment 2 presents gamble pairs with constant probabilities and changing outcomes while keeping the overall expected values and manipulated variables (trend and feedback) similar to Experiment 1.

Given the literature reviewed above, we expect a main effect of trend, where a change in probabilities (or outcomes) should be easier to detect when the changing option is decreasing (initially better but worsening over time) rather than increasing (initially worse but improving over time). We also expect a main effect of feedback, where full feedback should result in better adaptation to change compared to partial feedback. Finally, we expect that changing outcome values will facilitate adaptation and reduce the effect of trend compared to changing probabilities over time.

## Experiment 1: Detection of probability change

### Method

#### Participants

We tested 600 participants (M_{age} = 35.5, SD_{age} = 11.2 years; 323 female), recruited from Amazon Mechanical Turk. Participants were compensated solely based on their performance in the choice task (1 cent for every 100 points accumulated in the task), earning $2.60 on average (minimum = $1.65 and maximum = $4.85). No participants were excluded from analysis. The sample size was chosen based on the available budget.

#### Task and design

We used a repeated binary choice task commonly used in research on decisions from experience (Barron & Erev, 2003; Rakow & Miler, 2009). In this paradigm, participants make multiple choices between two buttons representing the gambles, a probability determines the outcome received after each option is selected, and immediate feedback is provided about the outcomes.

Participants were randomly assigned to one condition of a 3 × 2 between-participants design: a *trend* of probability (increasing, decreasing, or constant), and a type of *feedback* (partial or full). Participants were distributed about equally across the six conditions (*N* = 100 ± 1 for each). In each condition, participants completed 100 trials of the binary choice task. Both options were risky prospects with the same two possible outcomes (500 or 0 points), but one option was stationary (probabilities did not change over the course of 100 trials) and the other option was non-stationary (probabilities increased, decreased, or did not change, according to the experimental condition). In the stationary option, participants had a 0.5 probability of receiving the high-value outcome (500 points) and 0.5 probability of receiving the low-value outcome (0 points). In the non-stationary option, the probability of receiving the high-value outcome (500 points) increased, decreased, or stayed constant over the course of 100 trials (the probability of the low-value outcome, 0 points, was complementary).

Figure 1 illustrates the concept of partial and full feedback in an example of the increasing condition. Option A (stationary option) gives +500 points with 0.5 probability (0 otherwise), while Option B (non-stationary option) gives +500 with low probability initially but that probability increases over time. After each choice, participants may see the outcome of only their selected choice (*partial feedback*, Fig. 1, left panel) or may see the outcomes from both their selected choice and the forgone choice (*full feedback*, Fig. 1, right panel). Feedback refers to the realization of the outcomes and not to information about the gamble probabilities.
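The feedback manipulation can be sketched in a few lines (a minimal illustration of the paradigm, not the experiment software; the function and variable names are our own). On every trial both outcomes are realized, but partial feedback reveals only the chosen one:

```python
import random

# Sketch of one trial of the binary choice task. `p_a` and `p_b` are the
# probabilities that options A and B pay the 500-point outcome on this trial.
def run_trial(choice, p_a, p_b, full_feedback, rng=random):
    # Outcomes of both options are realized independently on every trial
    outcome_a = 500 if rng.random() < p_a else 0
    outcome_b = 500 if rng.random() < p_b else 0
    chosen = outcome_a if choice == "A" else outcome_b
    if full_feedback:
        # Full feedback: the forgone option's outcome is shown as well
        return {"chosen": chosen, "forgone": outcome_b if choice == "A" else outcome_a}
    return {"chosen": chosen}  # partial feedback: forgone outcome stays hidden

# Example: choosing the stationary option early in the increasing condition
feedback = run_trial("A", p_a=0.5, p_b=0.01, full_feedback=False)
assert set(feedback) == {"chosen"} and feedback["chosen"] in (0, 500)
```

Note that the feedback manipulation changes only what is displayed; the outcome-generating process is identical in both conditions.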

Figure 2 illustrates the trend in the probabilities of the high-value outcome in the three trend conditions (continuous line): increasing, decreasing, and constant. The figure also shows the probabilities of the stationary option (dashed line).

In the *increasing* condition, the probability of receiving 500 points in the non-stationary option started at 0.01, increased by 0.01 on each subsequent trial, and ended at 1.00 on the final trial, while 0 was obtained with the complementary probability. In the *decreasing* probability condition, the probability of receiving 500 points in the non-stationary option started at 1.00, decreased by 0.01 on each subsequent trial, and ended at 0.01 on the final trial, while 0 was obtained with the complementary probability. In the *constant* probability condition, the probability of receiving 500 points in the non-stationary option was 0.50 for every trial, and 0 was obtained with 0.50 probability, matching exactly the stationary option in all the conditions. Both choice options in all conditions had an equivalent expected value over all 100 trials (EV_{stationary} = EV_{non-stationary} = (0.50 × 0) + (0.50 × 500) = 250 points).

Although the stationary and non-stationary options in all conditions had equal overall expected values, for each trial, the relative benefit of the non-stationary over the stationary option depended upon the condition of the probability trend. In the increasing condition the non-stationary option had lower expected value than the stationary option during trials 1–49; it had equal expected value as the stationary option at trial 50; and it had higher expected value than the stationary option during trials 51–100. The decreasing condition had the reverse pattern. We define trial 50 as the *switch point* at which the stationary and non-stationary options switched their relative expected value. In the constant condition neither option had a higher expected value than the other on any trial.
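The trend schedules and the switch point can be written out directly (our reconstruction from the description above, not the authors' code; all names are ours):

```python
# Probability schedules for Experiment 1. Trial t runs from 1 to 100.
HIGH, LOW = 500, 0  # possible outcomes (points)

def p_high(trend, t):
    """Probability of the 500-point outcome on trial t (1-indexed)."""
    if trend == "increasing":
        return 0.01 * t                # 0.01 on trial 1 ... 1.00 on trial 100
    if trend == "decreasing":
        return 1.00 - 0.01 * (t - 1)   # 1.00 on trial 1 ... 0.01 on trial 100
    return 0.5                         # constant condition (matches stationary)

def ev(p):
    """Per-trial expected value for a given probability of the high outcome."""
    return p * HIGH + (1 - p) * LOW

# The stationary option always has EV = 0.5 * 500 = 250 points per trial.
# In the increasing condition the non-stationary option is worse on trials
# 1-49, matches the stationary option at the switch point (trial 50, where
# p = 0.50), and is better on trials 51-100.
assert ev(p_high("increasing", 50)) == ev(0.5) == 250
assert all(ev(p_high("increasing", t)) < 250 for t in range(1, 50))
assert all(ev(p_high("increasing", t)) > 250 for t in range(51, 101))
```

The decreasing schedule mirrors this pattern, starting above 250 points per trial and ending below it.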

In this task, adaptation is operationalized as the proportion of expected-value-maximizing choices in all trials after the expected value switch point. Additionally, per-trial analyses of choices after the switch point will allow us to also consider the speed of adaptation in our results.
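This operationalization can be sketched directly (our illustration, assuming trials 1-50 count as "before" the switch point; the coding of choices is an assumption):

```python
# `chose_max` holds one 0/1 entry per trial: 1 if the participant chose the
# option with higher expected value on that trial, 0 otherwise.
SWITCH = 50  # trial at which the options' relative expected values flip

def max_rates(chose_max):
    """Return (Max-rate before the switch point, Max-rate after it)."""
    before = chose_max[:SWITCH]   # trials 1-50
    after = chose_max[SWITCH:]    # trials 51-100
    return sum(before) / len(before), sum(after) / len(after)

# A "sticky" participant who always picks the initially better option
# maximizes on every pre-switch trial but on no post-switch trial:
sticky = [1] * 50 + [0] * 50
assert max_rates(sticky) == (1.0, 0.0)
```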

#### Procedure

Participants first provided informed consent and answered demographic questions. They received instructions about the task according to their feedback condition (see Appendix 24), and then made 100 choices between the left and right buttons randomly assigned to the stationary option (labeled “A”) and the non-stationary option (labeled “B”). The left-right position of the non-stationary option was counterbalanced. After completing the task, participants filled out a debriefing questionnaire to ascertain their awareness of changes in the underlying probabilities: 1) “Overall, which option do you think was the best (gave you the most points on average)?” with answer choices of “A,” “B,” and “Neither”; 2) “Within each block of rounds below, what do you think the relationship was between the options?”. For five prompts that split the 100 trials into blocks of 20 trials (e.g., “Rounds 1–20,” “Rounds 21–40,” etc.), they selected answers from “A and B gave me the same number of points, on average,” “A gave me more points than B, on average,” or “B gave me more points than A, on average.”

The datasets for Experiments 1 and 2 are both available in the Open Science Framework repository.^{Footnote 1} The experiments were not pre-registered.

### Results

#### Proportion of maximizing choices before and after the switch point

We investigated adaptation through the proportion of choices of the option with higher expected value (Max-rate) after the switch point, independent of whether an option was stationary or non-stationary. Figure 3 shows the overall and per-trial Max-rate before and after the switch point for the four experimental conditions. In the constant condition, neither choice option is better, therefore we removed this condition from these analyses.^{Footnote 2} Descriptively, these proportions suggest that participants are capable of selecting the better option before the switch point (i.e., Max-rate in all conditions is greater than 0.50), but have trouble selecting the better option after the switch point (i.e., Max-rates are at 0.51 and below on average). It is also possible to observe that before the switch point, the Max-rate is on average higher in the full feedback compared to the partial feedback condition in both the increasing and decreasing trends. After the switch point, however, full feedback does not seem to improve the Max-rate on average, as it seems affected more by trend; however, the speed of adaptation after the switch point may be affected by full feedback.

To assess these descriptive observations, we fit a generalized logit model^{Footnote 3} predicting the Max-rate by trend, feedback, trial, and before/after block, and by their two-way, three-way, and four-way interactions. This model was a significant improvement over the null (χ^{2}(15) = 3890.1, *p <* .001). The full set of regression coefficients (providing effect sizes) are reported in Appendix 25, Table 4. The analyses revealed a significant main effect of feedback χ^{2}(1) = 67.58, *p <* 0.001, where the Max-rate was higher with full feedback (M_{full} = 0.121 log-odds) than with partial feedback (M_{partial} = 0.033 log-odds). There was also a significant main effect of trend χ^{2}(1) = 23.24, *p* < 0.001, where the Max-rate was higher for the increasing condition (M_{increasing} = 0.118 log-odds) than the decreasing condition (M_{decreasing} = 0.035 log-odds). The main effect of block was significant, χ^{2}(1) = 2043.08, p < 0.001, where the Max-rate was higher before the switch point (M_{before} = 1.010 log odds) than after the switch point (M_{after} = -0.860 log odds). The main effect of trial was also significant, χ^{2}(1) = 570.48, *p* < 0.001.
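Because the marginal means above are reported on the log-odds scale, they can be converted back to predicted choice proportions with the inverse-logit function (a quick interpretive sketch using the means reported above):

```python
import math

def logodds_to_prop(m):
    """Convert a log-odds marginal mean to a predicted choice proportion."""
    return 1 / (1 + math.exp(-m))

# Marginal means of the block effect, reported above on the log-odds scale:
# a Max-rate of about 0.73 before the switch point vs. about 0.30 after it.
assert round(logodds_to_prop(1.010), 2) == 0.73   # before the switch point
assert round(logodds_to_prop(-0.860), 2) == 0.30  # after the switch point
```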

The two-way interaction of trend and block was significant χ^{2}(1) = 73.82, *p <* 0.001, indicating an effect of trend on adaptation. Before the switch point, the Max-rate was higher in the increasing condition (M_{before-inc}=1.251 log odds) than in the decreasing condition (M_{before-dec} = 0.777 log odds). After the switch point, the Max-rate was instead higher in the decreasing condition (M_{after-dec} = -0.706 log odds) than the increasing condition (M_{after-inc} = -1.015 log odds). Additionally, the two-way interaction of feedback and block was significant χ^{2}(1) = 117.32, *p <* 0.001. Before the switch point, the Max-rate was higher with full feedback (M_{before-full} = 1.286 log odds) than with partial feedback (M_{before-partial} = 0.741 log odds), while after the switch point the Max-rate was higher with partial feedback (M_{after-partial} *=* -0.676 log odds) than with full feedback (M_{after-full} *=* -1.045 log odds). The interaction of trend and feedback was not significant (*p* > 0.05). The interaction of trend and trial was significant, χ^{2}(1) = 19.31, *p* < 0.001. The interaction of feedback and trial was significant, χ^{2}(1) = 17.10, *p* < 0.001. The interaction of block and trial was significant, χ^{2}(1) = 302.51, *p* < 0.001. Looking at these estimates for the effects of trial, we can see the speed of adaptation after the switch interacts with both feedback and trend: participants in the increasing condition adapt equally quickly with partial and full feedback, while participants in the decreasing, partial feedback condition adapt more slowly than those in the increasing conditions, and participants in the decreasing, full feedback condition adapt more quickly than those in the increasing condition.

The three-way interaction of trend, feedback, and block was also significant, χ^{2}(1) = 16.08, *p* < 0.001. The interaction of trend, feedback, and trial was significant, χ^{2}(1) = 38.93, *p* < 0.001. The interaction of trend, block, and trial was significant, χ^{2}(1) = 15.45, *p* < 0.001. The interaction of feedback, block, and trial was significant, χ^{2}(1) = 10.27, *p* < 0.001. The four-way interaction between trend, feedback, block, and trial was not significant (*p* > 0.05).

Overall, these results indicate that adaptation to change is affected by trend and feedback. On average, full feedback helps participants choose the better option before the switch point; however, on average, it does not help after the switch point. On average, after the switch point, we observe more maximization in the decreasing condition than in the increasing condition. The speed of adaptation, however, does appear to be increased by the presence of full feedback, and the speed of adaptation also interacts with trend, such that decreasing, partial conditions show the slowest adaptation speed, and decreasing, full conditions show the highest.

#### Individual-level analyses of adaptation to change

To assess successful adaptation to change at the individual level, we examined the average Max-rate before and after the switch point per participant in each of the four changing conditions. Each dot in Fig. 4 illustrates one participant’s average Max-rate in trials before and after the switch point. The dashed lines separate the participants into four quadrants. Along the X-axis, participants were split into groups based on their Max-rate before the switch point (either a Max-rate ≤ 0.5 or a Max-rate > 0.5). Along the Y-axis, participants were split into groups based on their Max-rate after the switch point (either a Max-rate ≤ 0.5 or a Max-rate > 0.5).

For each observation, we used the quadrant to classify participants into four types of choice behavior. “Fortunate” respondents were participants who most often selected the option with lower expected value before the switch point and did not change after the switch point (luckily allowing them to maximize after the switch point). “Agile” respondents selected the best option both before and after the switch point, requiring a shift in preferred options to adapt to the change at the switch point. “Clumsy” respondents selected the lower expected value option before the switch point and then shifted to continue selecting the lower expected value option after the switch point; and “Rigid” respondents selected the maximizing option before the switch point and continued to prefer this same option after the switch point, unfortunately selecting the poor option after the change. In other words, Agile and Clumsy participants explicitly adapted to change, and Fortunate and Rigid participants did not.
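The four-way classification can be sketched as a simple threshold rule (our illustration of the quadrant scheme in Fig. 4; labels follow the text, with cut points at a Max-rate of 0.5):

```python
def classify(max_before, max_after):
    """Classify a participant by Max-rate before/after the switch point."""
    good_before = max_before > 0.5
    good_after = max_after > 0.5
    if good_before and good_after:
        return "Agile"      # adapted: maximized in both blocks
    if good_before and not good_after:
        return "Rigid"      # stuck with the initially better option
    if not good_before and good_after:
        return "Fortunate"  # kept the initially worse option, which improved
    return "Clumsy"         # shifted toward the now-worse option

assert classify(0.9, 0.2) == "Rigid"
assert classify(0.8, 0.8) == "Agile"
```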

Figure 4 provides exact percentages for each experimental condition, while Table 1 shows the percentage of participants of each type averaged over trend levels, averaged over feedback levels, and the differences between the feedback and trend levels (respectively). A Pearson’s χ^{2} test of independence between experimental condition and the proportion of participants in each of the four maximization behavior categories was non-significant (χ^{2}(9) = 11.68, *p* = 0.23), indicating that the distribution of participants among these categories did not differ significantly across the four experimental conditions. Tables 6 and 7 in Appendix B provide the observed counts, expected counts, and standardized residuals for this analysis. Appendix B, Fig. 13 simulates how random responses would appear in this type of analysis.
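The test of independence used here follows the standard Pearson formula; a toy sketch of that computation (with made-up counts, not the study's data):

```python
# Pearson chi-square statistic for a contingency table given as a list of
# rows, summing (observed - expected)^2 / expected over all cells.
def chi_square(table):
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / total  # count expected under independence
            stat += (obs - exp) ** 2 / exp
    return stat

# Perfectly independent counts give a statistic of exactly 0:
assert chi_square([[10, 20], [20, 40]]) == 0.0
```

For a table of 4 conditions by 4 behavior categories, the statistic is compared against a χ^{2} distribution with (4 − 1) × (4 − 1) = 9 degrees of freedom, as reported above.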

Given the lack of evidence for an effect of experimental condition on behavior category, we consider the general patterns of behavior across experimental conditions. Descriptively, there are more non-adaptive than adaptive individuals in all conditions. Most of the adaptive participants are Agile (38%), not Clumsy (3.5%), and most of the non-adaptive participants are Rigid (48%), not Fortunate (10.5%). These results suggest a general difficulty in adapting to change, with individual differences playing a role in whether decision makers adapted. Especially consistent with stickiness, we see substantial individual rigidity, where participants failed to switch to the alternate option after the switch point, rather than individual fortune in selecting the worse option initially and continuing to select it as it became the better option.

#### Beliefs of maximization

Here we analyzed the way participants answered questions indicating their belief regarding the relative value of the two options (stationary and non-stationary). Figure 5 shows the distributions of beliefs about overall value for the different conditions. In all conditions, fewer than half of the participants held the correct belief that the two options were of equal value overall (stationary = non-stationary), even in the constant condition. Thus, most participants believed (incorrectly) that the two options differed in overall value. As observed in Fig. 5, there was a general tendency to believe that the initially better option had the greater overall value (i.e., percent of stationary > non-stationary in the increasing condition and non-stationary > stationary in the decreasing condition), suggesting a primacy effect.

A Pearson’s χ^{2} test of independence between experimental condition and overall belief suggests that the distribution of overall belief varies significantly by experimental condition (χ^{2}(10) = 116.68, *p* < 0.001). Follow-up analyses used standardized Pearson residuals to identify significant deviations from the expected cell counts: the counts expected if the distribution of beliefs was independent of experimental condition (Agresti, 2002). These analyses suggest significant cell count deviations for participants who (incorrectly) believed the options were unequal. More often than expected, these participants believed that the initially better option had the greater overall value. Beliefs that the S option was better than the NS option occurred less frequently than expected in both the partial and full feedback, decreasing conditions, and more frequently than expected in both the partial and full feedback, increasing conditions. Beliefs that the NS option was better than the S option occurred more frequently than expected in both the partial and full feedback, decreasing conditions, and less frequently than expected in both the partial and full feedback, increasing conditions. In both the partial and full feedback, constant conditions, beliefs that one option was better than the other did not occur significantly more or less frequently than expected under independence. Tables 10 and 11 in Appendix B provide the observed counts, expected counts, and standardized residuals for this analysis. Given the probabilistic nature of the task, an additional analysis is provided in Fig. 14 in the Appendix. This analysis takes into account the outcomes that participants actually experienced; it suggests that the beliefs of participants in the partial feedback conditions may somewhat successfully track their experienced outcomes, but that participants in the full feedback conditions still too frequently believed that the initially better option was the better option overall.
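The belief analysis above pairs a chi-square test of independence with standardized Pearson residuals. As a minimal sketch of that procedure (the cell counts below are illustrative placeholders, not the paper’s data; the residual formula follows Agresti, 2002):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: 6 conditions (trend x feedback); columns: 3 beliefs (S > NS, S = NS, NS > S).
# Counts are invented for illustration only.
observed = np.array([
    [40, 30, 30],
    [45, 25, 30],
    [20, 30, 50],
    [25, 28, 47],
    [35, 32, 33],
    [33, 34, 33],
])

chi2, p, dof, expected = chi2_contingency(observed)

# Standardized Pearson residuals: (O - E) / sqrt(E * (1 - row prop.) * (1 - col prop.)).
# Cells with |residual| > 1.96 deviate from independence at roughly the .05 level.
n = observed.sum()
row_prop = observed.sum(axis=1, keepdims=True) / n
col_prop = observed.sum(axis=0, keepdims=True) / n
std_resid = (observed - expected) / np.sqrt(expected * (1 - row_prop) * (1 - col_prop))

print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
print(np.round(std_resid, 2))
```

With a 6 × 3 table, the test has (6 − 1)(3 − 1) = 10 degrees of freedom, matching the χ^{2}(10) statistics reported in both experiments.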

### Experiment 1: Discussion

The results of Experiment 1 reveal the difficulty of adapting to the continuous change of probabilities in a simple, binary choice task. There was both a main effect of trend and a trend by block interaction: on average, participants in the decreasing condition adapted more successfully to the change (after the switch point) than participants in the increasing condition, with further differences in the speed of adaptation influenced by trend and feedback. Also, on average, participants receiving full feedback made more maximizing choices before the switch point than participants receiving partial feedback, but feedback (on average) did not help after the switch point, although it did increase the speed of adaptation in the decreasing condition after the switch. This provides evidence for an asymmetry in adaptation based on the trend (direction of change) and feedback.

Analysis of individual strategies and of the post-survey beliefs about the expected values of the two options suggests “rigidity” as the most common problem with adaptation: participants continued to choose the same option after the switch point, and their beliefs suggest that the initially better option was considered the best option overall. Despite full feedback, it is possible that the change was very hard to detect because probability is not explicitly observed; rather, participants must infer a change in probability from the frequency of experiencing the high outcome. It is also possible that participants expected the initially better option to be the better option overall, and thus paid little attention to the frequency of experiencing the high outcome.

The question we address in Experiment 2 is whether making the object of change directly observable would improve the ability of participants to adapt to the environmental changes. In Experiment 2, the outcome value itself is the changing feature in the decision problem. We expect to see overall better adaptation to change when the object of change is made directly observable to the participants.

## Experiment 2: Detection of outcome change

### Method

#### Participants

We tested 603 participants (M_{age} = 34.2 years, SD_{age} = 10.6 years; 261 female), recruited from Amazon Mechanical Turk. Participants were compensated solely based on their performance in the choice task (1 cent for every 100 points accumulated in the task), earning $2.59 on average (minimum = $1.76, maximum = $4.62). No participants were excluded from analysis. The sample size was chosen based on the available budget.

#### Task and design

The design of this experiment was exactly the same as that of Experiment 1, except that the element of change was the outcome (the observable element of the gamble) instead of the probability (the unobservable element of the gamble). Participants were distributed about equally across conditions: increasing, full feedback, *N* = 98; increasing, partial feedback, *N* = 101; decreasing, full feedback, *N* = 101; decreasing, partial feedback, *N* = 102; constant, full feedback, *N* = 99; and constant, partial feedback, *N* = 102.

To keep the gambles of this experiment equivalent to those in Experiment 1, both options were risky prospects with a low-value outcome of 0 points and a high-value outcome of up to 1,000 points; one option was stationary (its outcome values did not change over the course of 100 trials) and the other was non-stationary (its outcome values increased, decreased, or stayed constant, according to the experimental condition). For all conditions, the stationary option was the same as in Experiment 1, offering a high outcome (500 points) and a low outcome (0 points), each with probability 0.5. In the *increasing* condition, the non-stationary option’s high-value outcome increased from 10 to 1,000, gaining ten points in each subsequent trial. In the *decreasing* condition, the non-stationary option’s high-value outcome decreased from 1,000 to 10, losing ten points in each subsequent trial. In the *constant* condition, the non-stationary option’s high-value outcome stayed constant at 500 over the course of 100 trials. Figure 6 illustrates examples of partial and full feedback with changing outcome values, and Fig. 7 illustrates these values in each of the three trend conditions.
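The payoff schedules just described can be sketched as follows (a minimal illustration assuming trials are indexed 1–100; the function names are ours, not the authors’):

```python
def high_outcome(trial, trend):
    """High-value outcome (points) of the non-stationary option on trials 1-100."""
    if trend == "increasing":    # 10 points on trial 1, +10 per trial, 1,000 on trial 100
        return 10 * trial
    if trend == "decreasing":    # 1,000 points on trial 1, -10 per trial, 10 on trial 100
        return 10 * (101 - trial)
    return 500                   # constant condition

def expected_value(trial, trend, option):
    """Per-trial EV: each option pays its high outcome with probability .5, else 0."""
    payoff = 500 if option == "stationary" else high_outcome(trial, trend)
    return 0.5 * payoff
```

Under this schedule the two options’ expected values cross at trial 50 (where the non-stationary high outcome reaches 500), reproducing the switch-point structure described in the text.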

Thus, Experiment 2 was objectively equivalent to Experiment 1 in terms of expected values; the relative benefits of the non-stationary over the stationary option were also equivalent to those in Experiment 1. In the increasing condition, the non-stationary option was worse in expected value than the stationary option before the switch point (trial 50), and better than the stationary option after the switch point. In the decreasing condition the non-stationary option was better than the stationary option before the switch point and worse than the stationary option after the switch point. The relative benefit of the two options was the same before and after the switch point in the constant condition.

The procedures were identical to Experiment 1.

### Results

Given the similarities between experiments, we conducted the same analyses as in Experiment 1. In what follows, we summarize the results for the proportion of maximizing choices,^{Footnote 4} the individual-level analyses of maximization behavior, and the analyses of beliefs. Supplemental results from the statistical analyses can be found in Appendix B.

As in Experiment 1, we analyzed the proportion of choices of the best option (Max-rate) in the increasing and decreasing conditions, both per trial and before and after the switch point (see Fig. 8). We observed multiple patterns similar to Experiment 1. Participants were generally capable of selecting the best option before the switch point (i.e., the Max-rate in all conditions was greater than 0.50). However, in most conditions (except for the decreasing, partial feedback condition) they had trouble adapting their choices after the switch point, even though the outcomes were directly observable and even with full feedback.

We fit a generalized logit model predicting the Max-rate by trend, feedback, trial, before/after block, and their interactions.^{Footnote 5} This model was a significant improvement over the null (χ^{2}(15) = 2216.9, *p* < 0.001). The full set of regression coefficients (providing effect sizes) is provided in Appendix B, Table 5. The analyses revealed a significant main effect of feedback, χ^{2}(1) = 357.25, *p* < 0.001, where the Max-rate was higher with full feedback (M_{full} = 0.434 log odds) than with partial feedback (M_{partial} = 0.134 log odds). There was also a significant main effect of trend, χ^{2}(1) = 11.11, *p* < 0.001, where the Max-rate was higher for increasing (M_{increasing} = 0.317 log odds) than for decreasing (M_{decreasing} = 0.250 log odds). The main effect of block was significant, χ^{2}(1) = 990.28, *p* < 0.001, where the Max-rate was higher before the switch point (M_{before} = 0.948 log odds) than after the switch point (M_{after} = -0.380 log odds). The main effect of trial was also significant, χ^{2}(1) = 548.70, *p* < 0.001.
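A model of this form can be estimated with a standard fixed-effects logistic regression. The sketch below uses synthetic stand-in data (the real dataset is not reproduced here, and the column names, sample sizes, and choice probabilities are our assumptions); the paper’s reported χ^{2} statistics come from tests on such a model, not from this code:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic long-format data: one row per participant x trial.
rng = np.random.default_rng(0)
n_participants, n_trials = 40, 100
rows = []
for pid in range(n_participants):
    trend = "increasing" if pid % 2 else "decreasing"
    feedback = "full" if pid % 4 < 2 else "partial"
    for t in range(1, n_trials + 1):
        block = "before" if t <= 50 else "after"
        p_max = 0.7 if block == "before" else 0.45  # illustrative block effect only
        rows.append(dict(pid=pid, trend=trend, feedback=feedback,
                         block=block, trial=t,
                         maximized=rng.random() < p_max))
df = pd.DataFrame(rows)
df["maximized"] = df["maximized"].astype(int)

# Full factorial of the four predictors: 16 coefficients (15 df over the null).
model = smf.logit("maximized ~ trend * feedback * block * trial", data=df).fit(disp=0)
```

The `trend * feedback * block * trial` formula expands to all main effects and interactions, which is why the reported omnibus test has 15 degrees of freedom.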

The two-way interaction of trend and block was significant, χ^{2}(1) = 51.88, *p* < 0.001, indicating an effect of trend on adaptation. Before the switch point, the Max-rate was higher in the increasing condition (M_{before-inc} = 1.130 log odds) than in the decreasing condition (M_{before-dec} = 0.766 log odds). After the switch point, the Max-rate was instead higher in the decreasing condition (M_{after-dec} = -0.266 log odds) than in the increasing condition (M_{after-inc} = -0.495 log odds). Additionally, the two-way interaction of feedback and block was significant, χ^{2}(1) = 54.98, *p* < 0.001. Before the switch point, the Max-rate was higher with full feedback (M_{before-full} = 1.243 log odds) than with partial feedback (M_{before-partial} = 0.652 log odds), while after the switch point Max-rates were comparable (M_{after-full} = -0.376 log odds, M_{after-partial} = -0.385 log odds). The interaction of trend and feedback was also significant, χ^{2}(1) = 101.91, *p* < 0.001: the benefit of full over partial feedback was larger in the increasing condition (M_{full-inc} = 0.478 log odds, M_{partial-inc} = 0.157 log odds) than in the decreasing condition (M_{full-dec} = 0.389 log odds, M_{partial-dec} = 0.110 log odds). The interaction of trend and trial was significant, χ^{2}(1) = 5.99, *p* = 0.01, as were the interactions of feedback and trial, χ^{2}(1) = 28.02, *p* < 0.001, and of block and trial, χ^{2}(1) = 73.27, *p* < 0.001. Looking at the estimates for the effects of trial, we can see that the speed of adaptation after the switch interacts with both feedback and trend. Participants with full feedback adapted more quickly than participants with partial feedback, and participants in the decreasing conditions adapted more quickly than those in the increasing conditions; however, this effect of trend was smaller in the full feedback condition than in the partial feedback condition, indicating that the helpful effects of feedback lie in the speed of adaptation.

The three-way interaction of trend, feedback, and block was also significant, χ^{2}(1) = 41.34, *p* < 0.001. The interaction of trend, feedback, and trial was not significant, *p* > 0.05. The interaction of trend, block, and trial was significant, χ^{2}(1) = 11.65, *p* < 0.001. The interaction of feedback, block, and trial was significant, χ^{2}(1) = 6.49, *p* = 0.01. The four-way interaction between trend, feedback, block, and trial was significant, χ^{2}(1) = 22.09, *p* < 0.001.

These results suggest that, as in Experiment 1, many participants successfully maximized before the switch point, but on average most had difficulty adapting to the change, with the exception of participants in the decreasing, partial feedback condition. On average, participants in the decreasing conditions showed comparable (full feedback) or better (partial feedback) adaptation to the change than participants in the increasing conditions. Once again, the speed of adaptation interacted with both feedback and trend, with full feedback increasing adaptation speed and a slight speed advantage for the decreasing over the increasing trend.

#### Individual-level analyses of adaptation to change

As in Experiment 1, we investigated adaptation to change at the individual level using the Max-rate before and after the switch point per participant in each of the four experimental conditions. Again, the quadrants classify the participants into Fortunate, Agile, Clumsy, and Rigid choice patterns, where Agile and Clumsy participants adapted to change by explicitly switching their choices after the switch point, while Fortunate and Rigid participants did not. Figure 9 shows the individual Max-rates by condition and indicates the observations in each quadrant (maximization behavior pattern). Table 2 shows the percent of participants of each type by experimental condition.

A Pearson’s χ^{2} test of independence between experimental condition and category of maximization behavior suggests the distribution of participants among the four categories of maximization behavior varies significantly with the experimental condition (χ^{2}(9) = 52.38, *p* < 0.001). The residuals indicate that there were fewer Agile participants and more Rigid participants than expected in the increasing, partial feedback condition, and also fewer Fortunate participants than expected in the increasing, full feedback condition – consistent with aggregate-level observations that the increasing, partial feedback condition is associated with more adaptation difficulty. Also consistent with aggregate-level observations of better adaptation in the decreasing condition, the residuals indicate that there were more Fortunate participants and fewer Rigid participants than expected in the decreasing, partial feedback condition. Tables 8 and 9 in Appendix B provide the complete set of observed counts, expected counts, and standardized residuals. Thus, we see more adaptive participants in the decreasing (49%) than increasing (43%) conditions, and more adaptive participants in the full feedback (57%) than the partial feedback (36%) conditions. The number of adaptive and non-adaptive individuals appears to depend on both the trend and the feedback. As in Experiment 1, descriptively over all conditions, most adaptive participants were Agile (40%), not Clumsy (7%); and most non-adaptive participants were Rigid (36%), not Fortunate (17%). These results suggest difficulty adapting to change, slightly more so in the increasing than the decreasing conditions, and greater difficulty with partial than with full feedback. The lack of adaptation is, again, likely due to individual rigidity, where participants failed to switch to the alternate option after the switch point, and not to fortune (selecting the worse option initially and continuing to select that initially worse option after the switch point, as it became the better option).
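The four-quadrant classification can be sketched as a simple rule over the block-wise Max-rates. The 0.5 cutoff below is our assumption for illustrating the quadrants, not necessarily the paper’s exact criterion:

```python
def classify(max_rate_before, max_rate_after, cutoff=0.5):
    """Assign a participant to a maximization-behavior quadrant.

    Inputs are the proportions of maximizing choices before and after the
    switch point; labels follow the descriptions in the text.
    """
    maximized_before = max_rate_before > cutoff
    maximized_after = max_rate_after > cutoff
    if maximized_before and maximized_after:
        return "Agile"       # switched options to keep maximizing
    if maximized_before:
        return "Rigid"       # stuck with the initially better option
    if maximized_after:
        return "Fortunate"   # kept the initially worse option, which became best
    return "Clumsy"          # switched options, but away from the best one
```

Because the best option flips at the switch point, maximizing in both blocks (Agile) requires physically changing options, while maximizing in only the second block (Fortunate) requires no change at all.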

#### Beliefs of maximization

Figure 10 shows the distributions of beliefs about the overall values of the two options for each of the different experimental conditions. As in Experiment 1, fewer than half of participants in all conditions held the correct belief that the two options were of equal value overall, even in the constant condition. A Pearson’s χ^{2} test of independence between experimental condition and overall belief suggests that the distribution of overall belief varies significantly by experimental condition (χ^{2}(10) = 41.67, *p* < 0.001). Follow-up analyses used standardized Pearson residuals to identify significant departures from the expected counts (the counts expected if the distribution of beliefs was independent of experimental condition; Agresti, 2002). These analyses find beliefs that the stationary option was better than the non-stationary option more often than expected in the partial feedback, increasing condition, with a trend in this direction in the full feedback, increasing condition. In contrast, beliefs that the stationary option was better occurred less often than expected in the full feedback, decreasing condition, with a trend in this direction in the partial feedback, constant condition. Beliefs that the non-stationary option was better than the stationary option occurred more frequently than expected in the full feedback, decreasing condition. Additionally, we see a departure from the patterns of Experiment 1 in the decreasing, partial feedback condition: neither the proportion of beliefs that the stationary option was better nor the proportion of beliefs that the non-stationary option was better differed significantly from what would be expected if condition had no effect on the distribution of beliefs. Descriptively, we see a hint of a recency effect, with slightly more frequent beliefs that the stationary option (the initially worse option) was better overall than beliefs that the non-stationary option was better (though the frequencies of both beliefs are similar). Finally, beliefs that the two options were equal in expected value trended toward fewer than expected in the partial feedback, increasing condition and toward more than expected in the full feedback, constant condition. Tables 12 and 13 in Appendix B provide the observed counts, expected counts, and standardized residuals for this analysis. Given the probabilistic nature of the task, an additional analysis is provided in Fig. 15 in the Appendix. This analysis takes into account the outcomes that participants actually experienced and, as in Experiment 1, suggests that participants in the partial feedback conditions may somewhat successfully track their experienced outcomes, but that participants in the full feedback conditions still too frequently believed that the initially better option was the better option overall.

## Cross-experiment comparison

Table 3 summarizes the results of the Max-rate across both experiments. First, in both experiments there was a main effect of feedback, such that full feedback led to significantly more maximizing choices than partial feedback, and a significant main effect of block, where the Max-rate was higher before the switch point (Block 1) than after it (Block 2), clearly exposing the general difficulty of adapting to change. Second, in both experiments there was a consistent significant interaction between trend and block, such that there was better adaptation on average (maximization over all trials after the switch point) in the decreasing than the increasing trend.

The main differences in Max-rate between the experiments lie in several interactions. First, Experiment 2 found a significant interaction between the feedback and trend conditions, while Experiment 1 did not. In Experiment 2, the maximization rate was higher for full than for partial feedback in both the increasing and decreasing conditions; however, the difference between full and partial feedback was greater in the increasing condition than in the decreasing condition. Second, Experiment 1 found a significant interaction between feedback, trend, and trial, while Experiment 2 did not. Third, Experiment 2 found a significant interaction between feedback, trend, block, and trial, while Experiment 1 did not. Additionally, Experiments 1 and 2 both found a significant interaction between feedback and block. However, in Experiment 1 this interaction was a crossover: before the switch point, the Max-rate was higher with full feedback than with partial feedback, while after the switch point the Max-rate was higher with partial feedback than with full feedback. In contrast, in Experiment 2, before the switch point the Max-rate was higher with full feedback than with partial feedback, while after the switch point the Max-rates were comparable.

## General discussion

We found a robust effect of trend on adaptation to change. Participants selected the maximizing option more often after the switch point when the non-stationary option *decreased* in value over time than when it *increased* in value, showing trend’s influence on adaptation. This *trend effect* in adaptation is consistent with the information asymmetry created by a *stickiness effect*: participants more often selected the choice option providing higher rewards early on, suggesting the increasing and decreasing conditions differed in whether information about the change (the non-stationary option) was available or salient. We offer two factors that moderate the trend effect: (a) the observability of the forgone payoffs (full vs. partial feedback) and (b) the observability of the element of change (i.e., the outcomes vs. the unobservable probabilities of the outcomes).

First, a trend effect for adaptation with partial feedback may be explained by the same mechanisms driving the “hot stove effect” (Denrell & March, 2001; Rich & Gureckis, 2019) and choice patterns seen in previous dynamic decision-making studies (Cheyette et al., 2016; Konstantinidis, Harman, & Gonzalez, 2022). Specifically, in partial feedback conditions, participants only experience the outcomes of the choice options they select. Therefore, participants may acquire more information about an option that is initially favorable than about an option that is initially unfavorable. In the case of the increasing condition – where the non-stationary option is initially unfavorable – many participants selected the favorable, stationary option more often, limiting their ability to learn about the improving value of the non-stationary option.

Similar factors may occur even with full feedback. We find that full feedback is essential for selecting the best option in early trials, but it is not enough for successful adaptation. That is, maximization frequency was higher with full feedback than with partial feedback but only before the switch in the relative expected value of the options. After the switch, full feedback only influenced speed of adaptation. This suggests that full feedback must be considered carefully as a method to help individuals successfully adapt to change. Depending on the number of decisions (or amount of time) after a change, a decision maker with full feedback may not be quick enough to counter the “disadvantage” provided by entering a changed environment with a strong preference for a previously better option, even though we would expect more information to improve a participant’s ability to detect and adapt to change.

These results could be explained by individual differences in the memory weight or attention given to the payoff of the selected option versus the forgone payoff, as suggested in previous work (Lejarraga et al., 2014; Rakow & Miler, 2009). If one weighs the outcomes of the selected option more heavily than the outcomes of the option not selected, this could also lead to stickiness in the selection of the initially more favorable option and to an inability to adapt when these outcomes change, even when feedback about both options is provided. Regarding attention, Ashby and Rakow (2016), using eye tracking, found that participants attend less to forgone than to obtained outcomes, especially after a long sequence of trials. Thus, a full feedback condition may have the same effects as the partial feedback condition as the task progresses, explaining why people might fail to identify changes that occur after the switch.
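The weighting mechanism described above can be illustrated with a toy delta-rule learner that discounts forgone payoffs (this is a sketch of the general idea, not a model fit in the paper; `alpha` and `w_forgone` are hypothetical parameters):

```python
def update_values(values, chosen, outcomes, alpha=0.3, w_forgone=0.5):
    """Delta-rule value update; the forgone payoff is under-weighted (w_forgone < 1).

    values:   dict of current option values, e.g. {"A": 0.0, "B": 0.0}
    chosen:   the option the learner selected on this trial
    outcomes: dict of the realized payoffs of both options (full feedback)
    """
    new_values = dict(values)
    for option, payoff in outcomes.items():
        weight = 1.0 if option == chosen else w_forgone
        new_values[option] += weight * alpha * (payoff - values[option])
    return new_values
```

With `w_forgone` < 1, an option that paid well early keeps an inflated value even as the forgone option starts paying more, producing exactly the stickiness under full feedback discussed in the text; setting `w_forgone = 0` recovers partial-feedback learning.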

Second, our results also suggest that stickiness effects depend on the direct observability of the element of change. More participants were classified as “agile” (respondents who, on average, selected the best option both before and after the switch point) in Experiment 2, when the outcome rather than the probability changed, and with full feedback (from 21.8% to 50% in the increasing condition, and from 38.2% to 48.5% in the decreasing condition). In contrast, in Experiment 1 full feedback increased the proportion of individuals classified as “rigid” (respondents who, on average, selected the maximizing option before the switch point and continued to prefer this same option after the switch point; from 49.5% to 57% in the increasing condition and from 40.6% to 44.4% in the decreasing condition). Thus, the main conclusion is that full feedback facilitates adaptation to change only when the object of change is directly observable.

In summary, we found evidence that decision makers are more successful at adapting to the decreasing than to the increasing expected value of an option, with or without counterfactual feedback about outcomes, and whether the element of change is indirectly or directly observable. These findings are consistent with explanations of over-reliance (through choice selection, memory, attention, or weighting processes) on initial experiences versus later (recent) experiences. They also suggest the importance of memory-based computational theories of dynamic decision making that can help clarify how individual decision makers treat early and later experiences when forming an understanding of dynamic environments – for example, whether early versus later experiences are more salient or are given more weight or attention in decision making (Konstantinidis et al., 2022). Such mechanisms may interact with different patterns of change to result in better or worse adaptation. While our results provide an initial step toward elucidating the dynamics of the memory balance between stickiness and recency effects, there is a substantial need to advance our understanding of these dynamic effects on individuals’ (and groups’ and organizations’) abilities to successfully adapt to changing conditions in the environment. Furthermore, our research studied a pattern of change that was immediate, linear, and continuous; to assess the generality and robustness of our findings, it is essential to systematically investigate the effects of other dynamic patterns of change that would give rise to the general effect of over-reliance on initial or more recent experiences.

## Notes

The rates of non-stationary choice (NS-rate) for all six experimental conditions are provided in the Appendix, Fig. 11.

A mixed-effects model (with random intercepts per participant) was originally estimated; however, the variance of this random effect was less than 1.0 and convergence issues were encountered, so here we report the results of estimating a solely fixed-effects model.

The rates of non-stationary choice (NS-rate) for all six experimental conditions are provided in the Appendix, Fig. 12.

A mixed-effects model (with random intercepts per participant) was originally estimated; however, as in Experiment 1, the variance of this random effect was less than 1.0 and convergence issues were encountered, so here we report the results of estimating a solely fixed-effects model.

## References

Agresti, A. (2002). *Categorical data analysis* (2nd ed.). Wiley-Interscience.

Ashby, N. J. S., & Rakow, T. (2016). Eyes on the prize? Evidence of diminishing attention to experienced and foregone outcomes in repeated experiential choice. *Journal of Behavioral Decision Making, 29*(2–3), 183–193. https://doi.org/10.1002/bdm.1872

Avrahami, J., Kareev, Y., & Fiedler, K. (2016). The dynamics of choice in a changing world: Effects of full and partial feedback. *Memory & Cognition*. https://doi.org/10.3758/s13421-016-0637-4

Barron, G., & Erev, I. (2003). Small feedback-based decisions and their limited correspondence to description-based decisions. *Journal of Behavioral Decision Making, 16*(3), 215–233. https://doi.org/10.1002/bdm.443

Behrens, T. E. J., Woolrich, M. W., Walton, M. E., & Rushworth, M. F. S. (2007). Learning the value of information in an uncertain world. *Nature Neuroscience, 10*(9), 1214–1221. https://doi.org/10.1038/nn1954

Camilleri, A. R., & Newell, B. R. (2011). When and why rare events are underweighted: A direct comparison of the sampling, partial feedback, full feedback and description choice paradigms. *Psychonomic Bulletin & Review, 18*(2), 377–384. https://doi.org/10.3758/s13423-010-0040-2

Cheyette, S., Konstantinidis, E., Harman, J., & Gonzalez, C. (2016). Choice adaptation to increasing and decreasing event probabilities. In *Proceedings of the 38th Annual Conference of the Cognitive Science Society*.

Denrell, J., & March, J. G. (2001). Adaptation as information restriction: The hot stove effect. *Organization Science, 12*(5), 523–538. https://doi.org/10.1287/orsc.12.5.523.10092

Erev, I., & Barron, G. (2005). On adaptation, maximization, and reinforcement learning among cognitive strategies. *Psychological Review, 112*(4), 912–931. https://doi.org/10.1037/0033-295X.112.4.912

Gonzalez, C., Lerch, J. F., & Lebiere, C. (2003). Instance-based learning in dynamic decision making. *Cognitive Science, 27*(4), 591–635. https://doi.org/10.1207/s15516709cog2704_2

Kellen, D., Pachur, T., & Hertwig, R. (2016). How (in)variant are subjective representations of described and experienced risk and rewards? *Cognition, 157*, 126–138. https://doi.org/10.1016/j.cognition.2016.08.020

Konstantinidis, E., Harman, J. L., & Gonzalez, C. (2022). Patterns of choice adaptation in dynamic risky environments. *Memory & Cognition, 50*, 864–881. https://doi.org/10.3758/s13421-021-01244-4

Lejarraga, T., Lejarraga, J., & Gonzalez, C. (2014). Decisions from experience: How groups and individuals adapt to change. *Memory & Cognition, 42*(8), 1384–1397. https://doi.org/10.3758/s13421-014-0445-7

Lohrenz, T., McCabe, K., Camerer, C. F., & Montague, P. R. (2007). Neural signature of fictive learning signals in a sequential investment task. *Proceedings of the National Academy of Sciences, 104*(22), 9493–9498. https://doi.org/10.1073/pnas.0608842104

Rakow, T., & Miler, K. (2009). Doomed to repeat the successes of the past: History is best forgotten for repeated choices with nonstationary payoffs. *Memory & Cognition, 37*(7), 985–1000. https://doi.org/10.3758/MC.37.7.985

Rakow, T., Newell, B. R., & Wright, L. (2015). Forgone but not forgotten: The effects of partial and full feedback in “harsh” and “kind” environments. *Psychonomic Bulletin & Review, 22*(6), 1807–1813. https://doi.org/10.3758/s13423-015-0848-x

Rich, A. S., & Gureckis, T. M. (2019). Lessons for artificial intelligence from the study of natural stupidity. *Nature Machine Intelligence, 1*(4), 174. https://doi.org/10.1038/s42256-019-0038-z

Wulff, D. U., Mergenthaler-Canseco, M., & Hertwig, R. (2018). A meta-analytic review of two modes of learning and the description-experience gap. *Psychological Bulletin, 144*(2), 140–176. https://doi.org/10.1037/bul0000115

Yechiam, E., & Rakow, T. (2012). The effect of foregone outcomes on choices from experience: An individual-level modeling analysis. *Experimental Psychology, 59*(2), 55–67. https://doi.org/10.1027/1618-3169/a000126

## Acknowledgements

This research was supported by the National Science Foundation Award number 1530479; by AFRL Award FA8650-20-F-6212 sub-award number 1990692; by the DARPA-ASIST program Award number FP00002636; and by ENM’s appointment to the Postgraduate Research Participation Program at the US Air Force Research Laboratory, administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the US Department of Energy and USAFRL. ENM was also supported by the US Air Force Research Laboratory’s 711th Human Performance Wing, through the Personalized Learning and Readiness Sciences Core Research Area. The views expressed in this paper are those of the authors and do not reflect the official policy or position of the US Air Force, Department of Defense, or the US Government.

## Author information

### Contributions

ENM: contributed to the development of the research questions, the methodological design and the design and development of experimental materials, performed data collection and analysis, and contributed to writing the original draft, reviewing and editing the manuscript. SJC: contributed to the development of the research questions, the methodological design and the design and development of experimental materials, assisted with data collection and analysis, and contributed to review and editing of the manuscript. CG: contributed to the initial ideas to outline the research and the development of the research questions; she defined the methodological design and the design of experimental materials, supervised data collection and analysis, contributed to writing the original draft, reviewing and editing the manuscript, and acquired financial support for the project.

## Additional information

### Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Appendices

### Appendix A. Task Instructions

**Experiment 1 task instructions**

“In this study you will be making multiple choices between two different options marked as two buttons, A and B. After you select one of the buttons you will receive information about the choice you made and the outcome that you obtained from your choice. *In addition, you will also see the outcome that you would have obtained had you chosen the other option.*

The outcome from your choice will accumulate in your total number of points. You will be paid 1 cent for every 100 points you earn. For example, if you win 15,000 points in this portion of the experiment, you will receive $1.50. You can receive up to $5.00, depending on your performance. The average participant earns approximately $2.50.”

Demographic information collected included Age, Gender, Country of Residence, Education Level, and an optional “Major/Field of Study.”

### Appendix B. Supplemental analyses

**Max-rate regression tables**

**Individual level maximization behavior patterns supplemental tables**

**Overall maximization beliefs supplemental analyses**

**NS-rate across conditions**

**Random responding simulation**

**Beliefs and experienced choice option values**

## About this article

### Cite this article

McCormick, E.N., Cheyette, S.J. & Gonzalez, C. Choice adaptation to changing environments: trends, feedback, and observability of change.
*Mem Cogn* **50**, 1486–1512 (2022). https://doi.org/10.3758/s13421-022-01313-2
