AI Safety and Reproducibility: Establishing Robust Foundations for the Neuropsychology of Human Values

  • Gopal P. Sarma
  • Nick J. Hay
  • Adam Safron
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11094)


We propose the creation of a systematic effort to identify and replicate key findings in neuropsychology and allied fields related to understanding human values. Our aim is to ensure that research underpinning the value alignment problem of artificial intelligence has been sufficiently validated to play a role in the design of AI systems.


Affective neuroscience, Social neuroscience, Human values



We would like to thank Owain Evans and several anonymous reviewers for insightful discussions on the topics of value alignment and reproducibility in psychology and neuroscience.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Medicine, Emory University, Atlanta, USA
  2. Vicarious AI, San Francisco, USA
  3. Department of Psychology, Northwestern University, Evanston, USA
