Neurocomputational mechanisms of adaptive learning in social exchanges

  • Polina M. Vanyukov
  • Michael N. Hallquist
  • Mauricio Delgado
  • Katalin Szanto
  • Alexandre Y. Dombrovski


Prior work on prosocial and self-serving behavior in human economic exchanges has shown that counterparts’ high social reputations bias striatal reward signals and elicit cooperation, even when such cooperation is disadvantageous. This phenomenon suggests that the human striatum is modulated by the other’s social value, which is insensitive to the individual’s own choices to cooperate or defect. We tested an alternative hypothesis that, when people learn from their interactions with others, they encode prediction error updates with respect to their own policy. Under this policy update account, striatal signals would reflect positive prediction errors when the individual’s choices correctly anticipated not only the counterpart’s cooperation but also defection. We examined behavior in three samples using reinforcement learning and model-free analyses and performed an fMRI study of striatal learning signals. To uncover the dynamics of goal-directed learning, we introduced reversals in the counterpart’s behavior and provided counterfactual (would-be) feedback when the individual chose not to engage with the counterpart. Behavioral data and model-derived prediction error maps (in both whole-brain and a priori striatal region of interest analyses) supported the policy update model. Thus, as people continually adjust their rate of cooperation based on experience, their behavior and striatal learning signals reveal a self-centered instrumental process corresponding to reciprocal altruism.
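The contrast between the two accounts can be illustrated with a minimal sketch. It is not the authors' fitted model: the names (`Trial`, `social_value_outcome`, `policy_outcome`, `prediction_errors`), the +1/-1 outcome coding, the single-value delta rule, and the learning rate are all assumptions chosen for illustration. Under the social value coding, the outcome depends only on what the counterpart does; under the policy coding, it depends on whether the individual's own choice anticipated the counterpart's actual or would-be response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    shared: bool      # participant chose to engage with the counterpart
    cooperated: bool  # counterpart's actual or would-be (counterfactual) response


def social_value_outcome(trial: Trial) -> float:
    """Assumed coding: +1 if the counterpart cooperates, -1 if they defect,
    regardless of the participant's own choice."""
    return 1.0 if trial.cooperated else -1.0


def policy_outcome(trial: Trial) -> float:
    """Assumed coding: +1 if the participant's choice matched the counterpart's
    behavior (shared and cooperated, or kept and defected), -1 otherwise."""
    return 1.0 if trial.shared == trial.cooperated else -1.0


def prediction_errors(trials, outcome_fn, alpha=0.3, v0=0.0):
    """Simple delta rule: delta = outcome - V, then V <- V + alpha * delta.
    alpha and v0 are illustrative values, not estimates from the study."""
    value = v0
    deltas = []
    for trial in trials:
        delta = outcome_fn(trial) - value
        value += alpha * delta
        deltas.append(delta)
    return deltas


# The participant kept, and counterfactual feedback shows the counterpart
# would have defected.
kept_and_defected = [Trial(shared=False, cooperated=False)]
print(prediction_errors(kept_and_defected, social_value_outcome))  # [-1.0]
print(prediction_errors(kept_and_defected, policy_outcome))        # [1.0]
```

In this toy case the two codings yield prediction errors of opposite sign on the same trial, which is the kind of divergence that lets counterfactual feedback distinguish the accounts.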


Reinforcement learning; Social decision-making; Counterfactual representations; Trust; Striatum; Prediction error; Behavior, cooperative



This research was supported by the National Institute of Mental Health (R01MH085651 to K.S.; R01MH100095 to A.Y.D.; K01MH097091 to M.N.H.) and the American Foundation for Suicide Prevention (Young Investigator Grant to P.M.V.). The authors thank Jonathan Wilson for assistance with data processing and analysis, Mandy Collier and Michelle Perry for assistance with data collection, and Laura Kenneally for assistance with the manuscript. The authors declare no competing financial interests.

Supplementary material

ESM 1: 13415_2019_697_MOESM1_ESM.docx (DOCX, 3901 kb)



Copyright information

© The Psychonomic Society, Inc. 2019

Authors and Affiliations

  • Polina M. Vanyukov (1)
  • Michael N. Hallquist (2)
  • Mauricio Delgado (3)
  • Katalin Szanto (1)
  • Alexandre Y. Dombrovski (1; corresponding author)

  1. Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, USA
  2. Department of Psychology, Pennsylvania State University, State College, USA
  3. Department of Psychology, Rutgers University, Newark, USA
