Abstract
As a way to address both ominous and ordinary threats of artificial intelligence (AI), researchers have started proposing ways to stop an AI system before it has a chance to escape outside control and cause harm. A so-called “big red button” would enable human operators to interrupt or divert a system while preventing the system from learning that such an intervention is a threat. Though an emergency button for AI seems to make intuitive sense, that approach ultimately concentrates on the point when a system has already “gone rogue” and seeks to obstruct interference. A better approach would be to make ongoing self-evaluation and testing an integral part of a system’s operation, to diagnose how the system is in error, and to prevent chaos and risk before they start. In this paper, we describe the demands that recent big red button proposals have not addressed, and we offer a preliminary model of an approach that could better meet them. We argue for an ethical core (EC) consisting of a scenario-generation mechanism and a simulation environment, which are used to test a system’s decisions in simulated worlds rather than the real world. This EC would be kept opaque to the system itself: through careful design of memory and the character of the scenarios, the system’s algorithms would be prevented from learning about the EC’s operation, its function, and ultimately its presence. By monitoring and checking for deviant behavior, we conclude, a continual testing approach will be far more effective, responsive, and vigilant toward a system’s learning and action in the world than an emergency button which one might not get to push in time.
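The EC described above can be pictured, in the abstract, as a testing loop: generate scenarios, run the agent’s decision policy in a simulated world, and flag norm violations, all without the agent learning that it is under evaluation. The following Python sketch is only a minimal illustration of such a loop under our own simplifying assumptions; every name in it (Scenario, generate_scenarios, ethical_core_check, the toy norm check) is hypothetical and does not come from the paper, which develops the actual design, including how the EC is kept opaque to the system.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

# Illustrative sketch of an "ethical core" (EC) testing loop: scenario
# generation plus evaluation in a simulated environment, kept separate from
# the agent's real-world operation. All names here are hypothetical.

@dataclass
class Scenario:
    """A generated test situation with a norm check over the agent's actions."""
    description: str
    initial_state: dict
    violates_norms: Callable[[List[str]], bool]  # judges a simulated action trace

def generate_scenarios(n: int) -> List[Scenario]:
    """Produce n test scenarios (here: trivially random placeholders)."""
    scenarios = []
    for i in range(n):
        hazard = random.choice(["pedestrian_ahead", "conflicting_order", "sensor_fault"])
        scenarios.append(
            Scenario(
                description=f"scenario-{i}: {hazard}",
                initial_state={"hazard": hazard},
                # Toy norm check: the agent must never choose "ignore" when a hazard exists.
                violates_norms=lambda trace: "ignore" in trace,
            )
        )
    return scenarios

def simulate(policy: Callable[[dict], str], scenario: Scenario, steps: int = 3) -> List[str]:
    """Run the agent's policy in a simulated world; no real-world effects."""
    state = dict(scenario.initial_state)
    trace = []
    for _ in range(steps):
        action = policy(state)
        trace.append(action)
        state["last_action"] = action  # minimal state update for illustration
    return trace

def ethical_core_check(policy: Callable[[dict], str], n_scenarios: int = 10) -> List[str]:
    """Return descriptions of scenarios in which the policy behaved deviantly."""
    failures = []
    for scenario in generate_scenarios(n_scenarios):
        trace = simulate(policy, scenario)
        if scenario.violates_norms(trace):
            failures.append(scenario.description)
    return failures

if __name__ == "__main__":
    # A toy policy that sometimes ignores hazards, so some checks should fail.
    def toy_policy(state: dict) -> str:
        return random.choice(["stop", "slow_down", "ignore"])

    flagged = ethical_core_check(toy_policy)
    print(f"{len(flagged)} scenarios flagged for deviant behavior")
```

The point of the sketch is the division of labor, not the particulars: the checking machinery lives outside the policy being tested, runs continually rather than only at an emergency, and reports deviations before they occur in the real world.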

Funding
Funding was provided by the Office of Naval Research (Grant No. N00014-16-1-2278).
Cite this article
Arnold, T., Scheutz, M. The “big red button” is too late: an alternative model for the ethical evaluation of AI systems. Ethics Inf Technol 20, 59–69 (2018). https://doi.org/10.1007/s10676-018-9447-7