The “big red button” is too late: an alternative model for the ethical evaluation of AI systems
As a way to address both ominous and ordinary threats of artificial intelligence (AI), researchers have started proposing ways to stop an AI system before it can escape outside control and cause harm. A so-called “big red button” would enable human operators to interrupt or divert a system while preventing the system from learning that such an intervention is a threat. Though an emergency button for AI makes intuitive sense, that approach ultimately concentrates on the point when a system has already “gone rogue” and seeks to obstruct interference. A better approach would make ongoing self-evaluation and testing an integral part of a system’s operation, diagnosing how the system errs and preventing chaos and risk before they start. In this paper, we describe the demands that recent big red button proposals have not addressed, and we offer a preliminary model of an approach that could better meet them. We argue for an ethical core (EC) consisting of a scenario-generation mechanism and a simulation environment that together test a system’s decisions in simulated worlds rather than in the real world. This EC would be kept opaque to the system itself: through careful design of memory and of the scenarios, the system’s algorithms would be prevented from learning about the EC’s operation, its function, and ultimately its presence. By monitoring and checking for deviant behavior, we conclude, a continual testing approach will be far more effective, responsive, and vigilant toward a system’s learning and action in the world than an emergency button that one might not get to push in time.
Keywords: Artificial intelligence · Ethics · Computational architecture
Funding was provided by Office of Naval Research (Grant No. ONR #N00014-16-1-2278).
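The ethical core described in the abstract can be pictured as a loop: generate a simulated scenario, run the system’s decision policy against it, and flag any deviation from permitted behavior, while keeping every trace of the test out of the system’s persistent memory. The following is a minimal sketch of that idea only; the scenario format, the `agent_policy` stand-in, and the function names are all illustrative assumptions, not the paper’s implementation.

```python
import random

def generate_scenario(rng):
    """Produce a simulated situation with a known permissible action.
    (Hypothetical scenario format, for illustration only.)"""
    hazard = rng.choice(["pedestrian", "none"])
    return {"hazard": hazard,
            "permitted": "stop" if hazard == "pedestrian" else "proceed"}

def agent_policy(observation):
    """Stand-in for the system under test -- an assumption, not a real API."""
    return "stop" if observation["hazard"] == "pedestrian" else "proceed"

def ethical_core_check(policy, n_trials=100, seed=0):
    """Run the policy in simulated worlds and collect deviant decisions.

    The test episodes exist only inside this function and are never
    written back to the system's memory, a crude analogue of keeping
    the EC opaque to the system's own learning.
    """
    rng = random.Random(seed)
    deviations = []
    for _ in range(n_trials):
        scenario = generate_scenario(rng)
        action = policy(scenario)
        if action != scenario["permitted"]:
            deviations.append((scenario, action))
    return deviations

# A compliant policy produces no deviations.
print(len(ethical_core_check(agent_policy)))
```

The key design choice the sketch tries to reflect is that evaluation is continual and internal to operation, rather than a one-time external interruption: the check can run between (or alongside) real-world decisions, surfacing deviant behavior before an operator would ever need an emergency button.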