If we are to make responsible decisions about regulating and using AI-powered machines, we need to know a lot more about them than we often do. This is especially true for modern AI algorithms (e.g., deep learning), which are opaque with regard to their reasoning. The training data, inputs, outputs, function, and boundaries of these machines must be known to us.
Training data
The data used to train machine-learning algorithms are extremely important to how the resulting algorithm or machine will work. Two algorithms that share the exact same code could behave wildly differently because they were trained on different datasets. A facial recognition system trained only on pictures of the faces of old white men will not work very well for young black women. If someone is to buy a facial recognition algorithm, then there should be some information about the faces used to train it. The number of faces and the breakdown of age, ethnicity, sex, etc., would be a basic start. The specifics regarding what information is needed about the training data will obviously vary depending on the context and the type of data.
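As a rough illustration, such a breakdown could be produced from whatever metadata accompanies a training set. The sketch below assumes hypothetical records with `age`, `ethnicity`, and `sex` fields; real datasets will document different attributes and require a different summary.

```python
from collections import Counter

# Hypothetical metadata records shipped alongside a facial-recognition
# training set; the field names here are illustrative assumptions.
training_metadata = [
    {"age": 67, "ethnicity": "white", "sex": "male"},
    {"age": 71, "ethnicity": "white", "sex": "male"},
    {"age": 24, "ethnicity": "black", "sex": "female"},
    # ... one record per training image
]

def summarize(records):
    """Report the size and demographic breakdown of a training set."""
    print(f"faces: {len(records)}")
    for field in ("ethnicity", "sex"):
        counts = Counter(record[field] for record in records)
        print(field, dict(counts))
    ages = [record["age"] for record in records]
    print(f"age range: {min(ages)}-{max(ages)}")

summarize(training_metadata)
```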
Knowledge regarding training data will also be important when implementing algorithms. Simply knowing that the training data lack a certain demographic would hopefully cause one to test the system before using it on that demographic, or to restrict its use to the demographics covered by the training data. For example, algorithms made to detect skin cancer were trained on images of moles mostly from fair-skinned patients, meaning the algorithm performs poorly on darker-skinned patients (Lashbrook 2018). Whatever the reasons for this biased training data, it is important to know this before such an algorithm is used on a dark-skinned patient.
Knowledge of training data can also help us identify unacceptable algorithms: those which will simply reinforce societal stereotypes (Koepke 2016; Ensign et al. 2017). Predictive policing algorithms which rely upon training data that are biased against African Americans simply should not be used. Knowledge of this bias would not lead to the algorithm’s envelopment; rather, it should, if possible, lead to fixing the training data.
Boundaries and inputs
The term ‘boundaries’ is construed broadly. It refers not only to physical boundaries in the case of a robot, but also to virtual boundaries: the possible inputs (or types of input), in the form of data, that a machine could encounter. ‘Boundaries’, then, refers to an algorithm’s or robot’s expected scenarios. For example, AlphaGo expects as an input a GO board with a configuration of white and black pieces. AlphaGo is not expected to be able to suggest a chess move based on an input of a chess board with a configuration of pawns, knights, bishops, rooks, queens, and kings on it. An algorithm playing chess is fine, but it is a different algorithm from AlphaGo.
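In code, enforcing such a boundary might look like the following sketch. The 19 × 19 grid of black stones, white stones, and empty points is an illustrative encoding, not AlphaGo’s actual interface.

```python
VALID_STONES = {"B", "W", "."}  # black stone, white stone, empty point

def within_boundaries(board):
    """Accept only a 19x19 GO position; reject anything else (e.g., a chess board)."""
    if len(board) != 19:
        return False
    return all(len(row) == 19 and all(cell in VALID_STONES for cell in row)
               for row in board)

go_position = [["."] * 19 for _ in range(19)]
chess_position = [["r", "n", "b", "q", "k", "b", "n", "r"]]  # outside the envelope

print(within_boundaries(go_position))     # True: the machine may act on this input
print(within_boundaries(chess_position))  # False: the input lies outside the boundaries
```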
Knowing precisely which boundaries a machine is constrained by helps us know what its possible inputs are. For example, a Roomba vacuum has the boundaries of one floor of a home or apartment. A user is given a limited space within which to make sure that the robot will function properly. We can imagine, by contrast, a seeing-eye robot given the task of guiding the blind when they go outside the home. Now we have a machine whose boundaries are potentially limitless. It would be impossible to know all the possible situations the machine could face. In other words, the inputs to the machine are limitless. With the Roomba, however, one can survey the floor and detect possible problem inputs; the human has the information needed to envelop the machine.
Boundaries are different from inputs. A machine’s inputs are determined by its sensors or code. The seeing-eye robot above may have cameras, microphones, and haptic sensors all serving as inputs into the machine. An ‘input’, as I want to talk about it here, is the combined data from all sensors. We, as humans, make decisions based on a number of factors. For example, we might put on a rain jacket because it is raining, it is not too cold outside (otherwise we would opt for a heavy jacket), and we are going to be outside. A machine might be able to tell a user to wear a rain jacket based on the same data because it has a temperature sensor to sense how cold it is outside, a data feed from a weather website (to ‘sense’ that it is raining), and a microphone to hear the user say they need to go outside. It is the combination of these data which determines what output will be given.
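A minimal sketch of such a machine, assuming made-up thresholds and field names, shows how the separate data streams are fused into one input before any output is produced.

```python
from dataclasses import dataclass

@dataclass
class CombinedInput:
    """One 'input' in the sense used here: all sensor data taken together."""
    temperature_c: float   # from a temperature sensor
    raining: bool          # from a weather data feed
    going_outside: bool    # from a voice command picked up by the microphone

def advise(reading: CombinedInput) -> str:
    # The decision depends on the combination of data, not on any single sensor.
    if not reading.going_outside:
        return "no jacket"
    if reading.raining and reading.temperature_c > 10:
        return "rain jacket"
    if reading.temperature_c <= 10:
        return "heavy jacket"
    return "no jacket"

print(advise(CombinedInput(temperature_c=15.0, raining=True, going_outside=True)))
```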
Therefore, we not only need to know what types of inputs there are (sound, image, temperature, specific voice commands, data feeds, etc.), but also how these are combined to form one input. There are machines which take very limited inputs yet make very important classifications. The machine capable of detecting cancerous moles can only accept an image of a mole as an input; we have a very clear understanding of its inputs. A driverless car, on the other hand, has many sensors which combine to provide infinite combinations of inputs.
I do not mean to suggest that a machine which can accept infinite combinations of inputs should not be used. We simply must know that this is the situation. We may know that an AI app on our phone accepts data from weather stations, our voice commands, images of our face, etc., as well as feedback after its decision (so that it can improve). Furthermore, it may not have any real boundaries; that is, it has the ability to grab data from other sources if doing so helps to improve its decisions. However, the function of the machine may simply be to decide whether or not to advise the user to wear a jacket. That is, it only has two outputs: jacket or no jacket. We can debate whether using AI for advice on our outdoor clothing is overkill; the point, however, is that a decision about the acceptability of a machine requires knowing not only its boundaries and inputs, but its function and outputs as well.
Functions and outputs
Knowledge of the functions and possible outputs of a machine is essential if we are to achieve the goal of enveloping AI-powered machines. In the AlphaGo example, the output is a legal move in the game of GO. We might be shocked by it making a particular move, but it is nonetheless a legal move in the game of GO. It would be strange if the function of AlphaGo were defined as “not letting an opposing player win” and, instead of making a move, its output were to mess up the board (because it knew there was no chance of winning and this was the only way to ensure that the other player did not win).
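One way to make this concrete in code: if the machine’s outputs are checked against the set of legal moves before they leave the machine, then ‘messing up the board’ is simply not in the output space. The `legal_moves` helper below is a deliberate simplification (it treats every empty point as legal and ignores captures, ko, suicide, and passing), used only to illustrate the idea of an output envelope.

```python
def legal_moves(board):
    """Simplified: any empty point counts as a legal move."""
    return {(r, c) for r, row in enumerate(board)
            for c, cell in enumerate(row) if cell == "."}

def enveloped_output(board, proposed_move):
    """Release an output only if it lies inside the envelope of legal moves."""
    if proposed_move in legal_moves(board):
        return proposed_move
    raise ValueError("proposed output falls outside the envelope")

board = [["."] * 19 for _ in range(19)]
print(enveloped_output(board, (3, 3)))   # a benign, legal move
# enveloped_output(board, (42, 42))      # would be refused, never reaching the world
```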
It can be easy to think that functions and outputs are equivalent. In the case of the jacket-deciding machine in the previous section, the function of the machine is to advise the user on whether or not to wear a jacket. This is the same as its output, which is either “jacket” or “no jacket”. This, however, is often not the case. The function of a driverless car is to drive from point A to point B; however, this will involve many outputs. Each turn, acceleration, swerve, and brake is an output. Defined functions are of the utmost importance because they allow us to test machines for efficacy. How well a machine performs its function is clearly salient with regard to its moral acceptability. If the malignant-mole-detecting algorithm were seldom successful at categorizing moles, then it would be unethical to use it. Equally unethical is using the algorithm while we are ignorant of how successful it is (i.e., using it outside of a testing environment).
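The kind of efficacy test this calls for can be sketched as follows; `classify_mole` and `test_set` are stand-ins for a real model and a held-out, labelled test set, and the numbers are placeholders.

```python
def efficacy(classify, labelled_images):
    """Fraction of labelled test images the classifier labels correctly."""
    correct = sum(1 for image, true_label in labelled_images
                  if classify(image) == true_label)
    return correct / len(labelled_images)

# Stand-ins for illustration only: a trivial 'model' and a tiny labelled test set.
def classify_mole(image):
    return "benign"

test_set = [("img_001", "benign"), ("img_002", "malignant"), ("img_003", "benign")]

score = efficacy(classify_mole, test_set)
print(f"accuracy: {score:.2f}")  # knowledge of this number must precede deployment
```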
Outputs are not the same as a machine’s function; however, they can be discussed in the same way that we talk about a machine’s capabilities. What can the machine do? A driverless car may be able to go 200 mph, which means that this is a possible output. A drone may have a machine gun built in, giving it the capability to shoot bullets, which means that a possible output is the shooting of bullets. This example makes it clear why it is so important to have knowledge regarding functions, outputs, boundaries, and inputs. A machine whose possible output is to shoot bullets may be acceptable if its only input is a user telling it to shoot and its boundaries are a bulletproof room. We need all of this knowledge to make informed decisions regarding the acceptability of machines.
Stepping out of the dark
Knowing the inputs, boundaries, training data, outputs, and functions of an AI-powered machine allows us to have some clue as to the envelopes these machines should be operating in. Even when machines are operated in environments so broad that we cannot prevent novel scenarios, the knowledge that this is the case helps inform our decisions regarding such a machine’s acceptability. If there are possible novel environments (and, therefore, we are ignorant of the possible inputs), then the outputs must be such that it does not matter. No matter what novel board configuration of the game GO is given to AlphaGo, the output is always a legal move of GO. A harmful output is simply not possible. It would not matter if AlphaGo took as its inputs live CCTV video feeds from all over the world; the outputs would always be the same benign GO moves (although such inputs would probably not help with the stated goal of winning the game of GO). This is in direct contrast to the situation we face with driverless cars. Their possible inputs are states of affairs on just about any road in the world, with the weather, pedestrians, other cars, etc., all combining to create consistently novel inputs. In this case, though, the outputs are potentially fatal.
Machines which have clear specifications regarding the properties listed in Sects. 4.1, 4.2 and 4.3 limit these problems. Cortis is an algorithm which detects voice patterns associated with cardiac arrest (Vincent 2018). The algorithm exists explicitly for the purpose of aiding emergency call operators (we know its function). It takes as its input live sound from the calling line. Its output is true if the voice pattern is associated with cardiac arrest and false if it is not (explicitly defined outputs). Because the algorithm is specified so explicitly, we have the knowledge to determine that this is an acceptable machine. If the machine is used within the boundaries given, then we can easily figure out what the possible scenarios are, without understanding how the machine comes to its decision. The machine outputs either true or false. If it outputs true, and the person on the end of the phone line is indeed having a cardiac arrest, then the machine may be instrumental in preventing a death. If the output is true, and no one on the end of the phone line is having a cardiac arrest, then emergency services may be sent out unnecessarily. While this is not ideal, knowing that it could occur gives us the knowledge to decide whether this risk is worth it. If the machine outputs false, and no one on the end of the line is having a cardiac arrest, then the emergency call is unaffected by the machine. The last scenario is the machine outputting false when someone on the line is having a cardiac arrest. This is the worst scenario; however, the consequences of the machine acting this way are no different from the consequences of the same emergency call handled without the machine. Again, the knowledge that this could happen is necessary for us to decide whether this is an acceptable risk.
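Because the outputs are binary, the four scenarios just described can be written out exhaustively; the sketch below simply tabulates them, with the consequence descriptions paraphrasing the text above.

```python
# Exhaustive outcome table for a binary cardiac-arrest detector:
# (machine output, caller actually in cardiac arrest) -> consequence
outcomes = {
    (True,  True):  "machine may be instrumental in preventing a death",
    (True,  False): "emergency services may be dispatched unnecessarily",
    (False, False): "the emergency call proceeds unaffected by the machine",
    (False, True):  "no worse than the same call handled without the machine",
}

for (output, arrest), consequence in outcomes.items():
    print(f"output={output!s:5} arrest={arrest!s:5} -> {consequence}")
```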
We can imagine a machine operating in a similar context that could result in unacceptable risk, because we would not have the knowledge necessary to make an informed choice. This machine would be a robot assigned the task of triaging incoming patients. The robot would be able to ‘see’ the patients, talk to them, and decide their place in line. The sheer number of possible inputs to this machine makes it difficult to determine how people could be harmed. One obvious way is that the machine could underestimate the seriousness of a person’s condition, resulting in their death. The possible harms are numerous and unpredictable. It could be that the machine results in less harm than when human beings are responsible for triaging; however, empirically validating this is next to impossible, especially before such machines are implemented.
If we are in the dark about the inputs, boundaries, functions, and outputs, then we have a machine we do not know enough about to properly envelop, leading to possible failures which will often pose an unacceptable risk to human beings. For, with modern AI, we are already in the dark about how it makes decisions. An unenveloped machine means that we are also in the dark about what could happen with these machines.
Ideally, AI-powered machines will be designed for envelopment, with clear ideas about the training data, inputs, functions, outputs, and boundaries. This knowledge would clearly be necessary to properly design for values or to facilitate having an ethicist as part of the design team (van Wynsberghe and Robbins 2014). Not only would this result in ethically better designs, but it may also prevent a waste of resources on a machine which cannot be enveloped and is, therefore, destined to fail.