AI Safety and Reproducibility: Establishing Robust Foundations for the Neuropsychology of Human Values

Sarma, Gopal P.; Hay, Nick J.; Safron, Adam

doi:10.1007/978-3-319-99229-7_45

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11094)

Included in the following conference series:

International Conference on Computer Safety, Reliability, and Security

Abstract

We propose the creation of a systematic effort to identify and replicate key findings in neuropsychology and allied fields related to understanding human values. Our aim is to ensure that research underpinning the value alignment problem of artificial intelligence has been sufficiently validated to play a role in the design of AI systems.

The views expressed herein are those of the author and do not necessarily reflect the views of Vicarious AI.

Acknowledgements

We would like to thank Owain Evans and several anonymous reviewers for insightful discussions on the topics of value alignment and reproducibility in psychology and neuroscience.

Author information

Authors and Affiliations

School of Medicine, Emory University, Atlanta, GA, USA
Gopal P. Sarma
Vicarious AI, San Francisco, CA, USA
Nick J. Hay
Department of Psychology, Northwestern University, Evanston, IL, USA
Adam Safron

Corresponding author

Correspondence to Gopal P. Sarma.

Editor information

Editors and Affiliations

Mälardalen University, Västerås, Sweden
Barbara Gallina
Norwegian University of Science and Technology, Trondheim, Norway
Amund Skavhaug
AIT Austrian Institute of Technology, Vienna, Austria
Erwin Schoitsch
Thales Deutschland GmbH, Ditzingen, Germany
Friedemann Bitsch

About this paper

Cite this paper

Sarma, G.P., Hay, N.J., Safron, A. (2018). AI Safety and Reproducibility: Establishing Robust Foundations for the Neuropsychology of Human Values. In: Gallina, B., Skavhaug, A., Schoitsch, E., Bitsch, F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2018. Lecture Notes in Computer Science, vol 11094. Springer, Cham. https://doi.org/10.1007/978-3-319-99229-7_45

DOI: https://doi.org/10.1007/978-3-319-99229-7_45
Published: 21 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99228-0
Online ISBN: 978-3-319-99229-7
eBook Packages: Computer Science (R0)
