Humans Adopt Different Exploration Strategies Depending on the Environment

Ferguson, Thomas D.; Fyshe, Alona; White, Adam; Krigolson, Olave E.

doi:10.1007/s42113-023-00178-1

Original Paper
Published: 15 August 2023

Volume 6, pages 671–696, (2023)

Computational Brain & Behavior

Thomas D. Ferguson^1,2,3 (ORCID: orcid.org/0000-0003-1909-2366),
Alona Fyshe^1,2,4,
Adam White^1,2 &
Olave E. Krigolson^3

Abstract

Humans explore to learn the structure of their environment. However, it remains unclear how consistent humans are in the exploration strategies they use, and in how often they explore, across environments that vary in volatility. Using a within-subjects design, participants (n = 30) completed (1) a non-stationary bandit task in which the reward values changed throughout, and (2) a stationary bandit task in which one option always gave a better reward. We used a series of reinforcement learning models to understand the exploration strategies humans adopted in the two tasks. We found that most participants adopted a behavioural heuristic strategy (Win-Stay, Lose-Shift) in the non-stationary bandit task. In contrast, most participants adopted a probabilistic, random exploration strategy (Softmax) in the stationary bandit task. We compared our results from fitting models individually within each task with those from fitting models across both tasks, that is, focusing on long-term predictions. When fitting across both tasks, we found that most participants solely adopted a probabilistic, random exploration strategy. In addition, we found a moderate, positive relationship between the exploration rates in the two bandit tasks. Our findings show that humans can flexibly adopt different exploration strategies depending on task demands, which we suggest is because the two bandit tasks assessed different aspects of learning and required different levels of cognitive flexibility. In addition, we speculate that the relationship between exploration rates could reflect a personality trait such as risk-taking. In sum, we found evidence for the flexible use of exploration strategies, while also observing evidence of the generalization of exploration across tasks.
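
To make the two strategies named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that outcomes can be coded as win or loss for Win-Stay, Lose-Shift, that a shift picks uniformly among the other options, and that Softmax uses a hypothetical inverse-temperature parameter `beta`; all function names are illustrative.

```python
import math
import random


def wsls_choice(prev_choice, prev_win, n_arms, rng=random):
    """Win-Stay, Lose-Shift: repeat the last arm after a win, otherwise switch.

    Assumptions: outcomes are coded as win/loss, n_arms >= 2, and a shift
    picks uniformly among the other arms. The paper's variant may differ.
    """
    if prev_choice is None:          # first trial: no history, choose at random
        return rng.randrange(n_arms)
    if prev_win:
        return prev_choice
    others = [a for a in range(n_arms) if a != prev_choice]
    return rng.choice(others)


def softmax_probs(values, beta):
    """Softmax choice probabilities over estimated arm values.

    beta is a hypothetical inverse-temperature parameter: beta = 0 gives
    uniform random choice, and large beta approaches greedy choice.
    """
    m = max(values)                  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_choice(values, beta, rng=random):
    """Sample an arm in proportion to its softmax probability."""
    probs = softmax_probs(values, beta)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]
```

The contrast is that Win-Stay, Lose-Shift depends only on the most recent outcome, whereas Softmax depends on learned value estimates and explores probabilistically.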

Data Availability

The datasets generated are available from the corresponding author upon reasonable request. Datasets are not publicly available because participants did not consent for their data to be shared in a public repository. Modeling and analysis code will be published on: https://github.com/tomferg/BanditComp

Code Availability

Code generated during modeling and analysis will be published at the following GitHub link: https://github.com/tomferg/BanditComp

Notes

In the non-stationary task, we always divided the point values obtained by 100 for all models that required reward estimates (ε-Greedy, Softmax, Sliding Window Upper Confidence Bound, Gradient, and Kalman Filter with Thompson Sampling).
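
The note does not state why the rewards were rescaled. One plausible effect can be sketched as follows: the same Softmax parameter behaves very differently on raw and rescaled values. This is an illustration only, assuming raw point values on a roughly 0 to 100 scale, a simple delta-rule update with a hypothetical learning rate `alpha`, and a hypothetical inverse temperature `beta`; none of these details are taken from the authors' code.

```python
import math

REWARD_SCALE = 100.0  # divisor stated in the note above


def delta_update(value, reward_points, alpha, scale=REWARD_SCALE):
    """Move a value estimate toward the rescaled reward (simple delta rule).

    alpha is a hypothetical learning rate in [0, 1].
    """
    reward = reward_points / scale
    return value + alpha * (reward - value)


def softmax_probs(values, beta):
    """Softmax over value estimates with hypothetical inverse temperature beta."""
    m = max(values)                  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative raw point values for two arms (made-up numbers).
raw_points = [60.0, 70.0]
scaled = [p / REWARD_SCALE for p in raw_points]

beta = 1.0
print(softmax_probs(raw_points, beta))  # almost all probability on the second arm
print(softmax_probs(scaled, beta))      # close to an even split
```

With raw values, a 10-point gap makes the choice nearly deterministic at `beta = 1`, while the rescaled values leave room for exploration under the same parameter.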

Funding

Thomas D. Ferguson would like to acknowledge support from the Dr. Roland and Muriel Haryett Neuroscience Fellowship and the Natural Sciences and Engineering Research Council of Canada. Alona Fyshe would like to acknowledge support from the Canadian Institute for Advanced Research (CIFAR) Canadian AI Chairs program. Adam White would like to acknowledge support from the CIFAR Canadian AI Chairs program. Olave E. Krigolson would like to acknowledge support from the Natural Sciences and Engineering Research Council of Canada (RGPIN 2016–0943). The authors declare that none of the funding sources mentioned above had any involvement in the design of the experiment or the preparation and submission of the manuscript.

Author information

Authors and Affiliations

Department of Computing Science, University of Alberta, Edmonton, AB, Canada
Thomas D. Ferguson, Alona Fyshe & Adam White
Alberta Machine Intelligence Institute, Edmonton, AB, Canada
Thomas D. Ferguson, Alona Fyshe & Adam White
Theoretical and Applied Neuroscience Laboratory, University of Victoria, Victoria, BC, Canada
Thomas D. Ferguson & Olave E. Krigolson
Department of Psychology, University of Alberta, Edmonton, AB, Canada
Alona Fyshe

Contributions

Thomas Ferguson and Olave Krigolson contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Thomas Ferguson. Thomas Ferguson, Alona Fyshe, and Adam White contributed to the computational models used. The first draft of the manuscript was written by Thomas Ferguson and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Thomas D. Ferguson.

Ethics declarations

Ethics Approval

The Human Research Ethics Board at the University of Victoria approved all experimental procedures (Date: 25-Sep-2019; 19–0230), and all research was performed in line with the principles of the Declaration of Helsinki.

Consent to Participate

Participants provided written informed consent prior to the completion of the experimental session.

Consent to Publish

Participants provided consent to have aggregated data (averages) published in a research journal.

Competing Interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Correspondence should be directed to: Thomas D. Ferguson, Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada, T6G 2R3.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 970 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ferguson, T., Fyshe, A., White, A. et al. Humans Adopt Different Exploration Strategies Depending on the Environment. Comput Brain Behav 6, 671–696 (2023). https://doi.org/10.1007/s42113-023-00178-1

Accepted: 16 July 2023
Published: 15 August 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s42113-023-00178-1
