State-transition-free reinforcement learning in chimpanzees (Pan troglodytes)

Learning & Behavior

Abstract

The outcome of an action often occurs after a delay. One solution for learning appropriate actions from delayed outcomes is to rely on a chain of state transitions. Another solution, which does not rest on state transitions, is to use an eligibility trace (ET) that directly bridges a current outcome and multiple past actions via transient memories. Previous studies revealed that humans (Homo sapiens) learned appropriate actions in a behavioral task in which solutions based on the ET were effective but transition-based solutions were ineffective. This suggests that ET may be used in human learning systems. However, no studies have examined nonhuman animals with an equivalent behavioral task. We designed a task for nonhuman animals following a previous human study. In each trial, participants chose one of two stimuli that were randomly selected from three stimulus types: a stimulus associated with a food reward delivered immediately, a stimulus associated with a reward delivered after a few trials, and a stimulus associated with no reward. The presented stimuli did not vary according to the participants’ choices. To maximize the total reward, participants had to learn the value of the stimulus associated with a delayed reward. Five chimpanzees (Pan troglodytes) performed the task using a touchscreen. Two chimpanzees were able to learn successfully, indicating that learning mechanisms that do not depend on state transitions were involved in the learning processes. The current study extends previous ET research by proposing a behavioral task and providing empirical data from chimpanzees.
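The task structure and the eligibility-trace idea described above can be illustrated with a minimal Python sketch. It is not the authors' model or analysis code. The names `DelayedRewardTask`, `TraceLearner`, and `simulate`, the two-trial delay, the reward size of 1, the epsilon-greedy choice rule, and all parameter values are assumptions made for illustration. The update rule is one simple ET formulation: each outcome is credited to every recently chosen stimulus in proportion to a decaying trace, with no model of state transitions.

```python
import random
from collections import deque

STIMULI = ("immediate", "delayed", "none")


class DelayedRewardTask:
    """Toy version of the task: one stimulus pays now, one pays `delay` trials later."""

    def __init__(self, delay=2):  # delay length is an assumed value
        self.pending = deque([0.0] * delay)  # rewards scheduled for upcoming trials

    def step(self, chosen):
        # Schedule a reward `delay` trials ahead if the delayed stimulus was chosen.
        self.pending.append(1.0 if chosen == "delayed" else 0.0)
        reward = self.pending.popleft()  # reward that falls due on this trial
        if chosen == "immediate":
            reward += 1.0
        return reward


class TraceLearner:
    """Value learner that uses an eligibility trace and no state transitions."""

    def __init__(self, alpha=0.1, lam=0.8):  # hypothetical parameter values
        self.alpha = alpha  # learning rate
        self.lam = lam      # per-trial trace decay; lam=0 means no trace
        self.value = {s: 0.0 for s in STIMULI}
        self.trace = {s: 0.0 for s in STIMULI}

    def update(self, chosen, reward):
        # Decay every trace, then mark the stimulus that was just chosen.
        for s in STIMULI:
            self.trace[s] *= self.lam
        self.trace[chosen] = 1.0
        # Credit the current outcome to all recently chosen stimuli.
        for s in STIMULI:
            self.value[s] += self.alpha * self.trace[s] * (reward - self.value[s])


def simulate(learner, n_trials=1000, epsilon=0.1, seed=0):
    """Run the learner on randomly drawn stimulus pairs (epsilon-greedy choice)."""
    rng = random.Random(seed)
    task = DelayedRewardTask()
    for _ in range(n_trials):
        # The presented pair is drawn at random and never depends on past choices.
        pair = rng.sample(STIMULI, 2)
        if rng.random() < epsilon:
            chosen = rng.choice(pair)
        else:
            chosen = max(pair, key=learner.value.get)
        learner.update(chosen, task.step(chosen))
    return learner.value


if __name__ == "__main__":
    print(simulate(TraceLearner()))
```

In this sketch, a learner with `lam=0` updates only the stimulus chosen on the trial when a reward arrives, so a single choice of the delayed stimulus followed by other choices leaves its value unchanged. With `lam > 0`, part of the credit reaches the earlier choice. Other ET formulations are possible, and the values used here are not taken from the paper.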

Data availability

Data and materials are available upon reasonable request.

Code availability

Code is available upon reasonable request.

Acknowledgements

We thank the staff and researchers at Kumamoto Sanctuary for their help with the study, particularly Dr. N. Morimura and Dr. F. Kano. We thank Benjamin Knight, MSc., from Edanz (https://jp.edanz.com/ac) for editing a draft of this manuscript.

Funding

This study was supported financially by the Ministry of Education, Culture, Sports, Science and Technology and the Japan Society for the Promotion of Science through grants to YSato (grant number 19J22889), to SH (grant numbers 26245069, 18H05524, 23H00494), and to TM (grant number 16H06283); by a Program for Leading Graduate Schools grant to TM (U04); and by the Great Ape Information Network.

Author information

Contributions

Conceptualization: Yutaro Sato, Yutaka Sakai, Satoshi Hirata. Methodology: Yutaro Sato, Yutaka Sakai. Formal analysis, Investigation, Data Curation, and Visualization: Yutaro Sato. Writing–Original Draft: Yutaro Sato, Yutaka Sakai. Resources: Yutaro Sato, Satoshi Hirata. Writing–Review and Editing: Satoshi Hirata. Supervision: Satoshi Hirata. Project administration: Satoshi Hirata. Funding acquisition: Yutaro Sato, Satoshi Hirata.

Corresponding author

Correspondence to Yutaro Sato.

Ethics declarations

Ethics approval

Animal husbandry and research protocols complied with the Guide for Animal Research Ethics provided by the Wildlife Research Center, Kyoto University (No. WRC-2020-KS006A). For human participants (Online Supplementary Materials (OSM)), the research protocol was approved by the Ethics Committee of the Unit for Advanced Study of Mind at Kyoto University (2-P-16).

Consent to participate

Informed consent was obtained from all individual human participants included in the study (OSM).

Consent for publication

Human participants (OSM) provided signed informed consent, which covered the publication of their data.

Conflict of interest

The authors have no known conflicts of interest to disclose.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open practices statements

None of the data or materials for the experiments reported here have been deposited online, and none of the experiments was preregistered.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (PDF 537 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sato, Y., Sakai, Y. & Hirata, S. State-transition-free reinforcement learning in chimpanzees (Pan troglodytes). Learn Behav 51, 413–427 (2023). https://doi.org/10.3758/s13420-023-00591-3

  • DOI: https://doi.org/10.3758/s13420-023-00591-3
