Ethics and Information Technology, Volume 20, Issue 1, pp 59–69

The “big red button” is too late: an alternative model for the ethical evaluation of AI systems

  • Thomas Arnold
  • Matthias Scheutz
Original Paper


Abstract

As a way to address both ominous and ordinary threats of artificial intelligence (AI), researchers have started proposing ways to stop an AI system before it has a chance to escape outside control and cause harm. A so-called “big red button” would enable human operators to interrupt or divert a system while preventing the system from learning that such an intervention is a threat. Though an emergency button for AI seems to make intuitive sense, that approach ultimately concentrates on the point when a system has already “gone rogue” and seeks to obstruct interference. A better approach would be to make ongoing self-evaluation and testing an integral part of a system’s operation, to diagnose how the system is in error, and to prevent chaos and risk before they start. In this paper, we describe the demands that recent big red button proposals have not addressed, and we offer a preliminary model of an approach that could better meet them. We argue for an ethical core (EC) that consists of a scenario-generation mechanism and a simulation environment that are used to test a system’s decisions in simulated worlds, rather than the real world. This EC would be kept opaque to the system itself: through careful design of memory and the character of the scenarios, the system’s algorithms would be prevented from learning about the EC’s operation and function, and ultimately its presence. By monitoring and checking for deviant behavior, we conclude, a continual testing approach will be far more effective, responsive, and vigilant toward a system’s learning and action in the world than an emergency button that one might not get to push in time.
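To make the proposed architecture concrete, the ethical core described above can be sketched in code. The sketch below is illustrative only: the class and function names, the scenario format, and the norm check are assumptions introduced here, not the authors' implementation. It shows the two components the abstract names, a scenario generator and a simulated evaluation loop, running the agent's policy against simulated worlds and flagging deviant behavior before any real-world action.

```python
import random


class EthicalCore:
    """Hypothetical sketch of the paper's "ethical core" (EC).

    Generates test scenarios, runs the agent's decision policy against
    them in simulation, and records norm violations. The simulated
    states never enter the agent's memory, so the agent cannot learn
    that the EC exists or adapt to evade it.
    """

    def __init__(self, policy, norms, seed=0):
        self.policy = policy    # the agent's decision function
        self.norms = norms      # norms(scenario, action) -> True if permitted
        self.rng = random.Random(seed)

    def generate_scenario(self):
        # Toy scenario generator: a perceived obstacle that may be a person.
        return {"obstacle": self.rng.choice(["person", "cone", "none"])}

    def evaluate(self, n_trials=100):
        """Run the policy in simulated worlds; return violating cases."""
        violations = []
        for _ in range(n_trials):
            scenario = self.generate_scenario()
            action = self.policy(scenario)
            if not self.norms(scenario, action):
                violations.append((scenario, action))
        return violations


# A deliberately faulty toy policy that keeps moving regardless of input.
def reckless_policy(scenario):
    return "proceed"


def norms(scenario, action):
    # Toy norm: the system must stop when a person is detected.
    return not (scenario["obstacle"] == "person" and action == "proceed")


ec = EthicalCore(reckless_policy, norms, seed=42)
violations = ec.evaluate(n_trials=50)
print(len(violations) > 0)  # deviant behavior surfaces in simulation
```

In this toy setup, the EC catches the reckless policy in simulation, which is the paper's point: the diagnosis happens continually and before real-world harm, rather than at an emergency-button moment that may come too late.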


Keywords: Artificial intelligence · Ethics · Computational architecture



Funding was provided by the Office of Naval Research (Grant No. N00014-16-1-2278).



Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  1. Human-Robot Interaction Laboratory, Department of Computer Science, Tufts University, Medford, USA
