The new framework can inform the ethical evaluations and subsequent action planning of managers, public policy makers, and civil society groups, helping them better understand the implications of, and accountability responses to, AI robot applications. It is not intended as a prescriptive or static framework, given the inherent variation in application forms, movements in the state of technology and, crucially, shifts in societal expectations of what is and is not morally acceptable and legitimate (Suchman, 1995). We suggest two axial themes driving the framework—locus of morality and moral intensity—that combine in unique ways to render specific ‘clusters of accountability’ necessary for AI applications in business (Fig. 1). Figure 1 captures the increasingly dispersed nature of accountability as an outcome of rising moral intensity and AI agency, and how this implicates different kinds of actors and levels of analysis. Our four accountability clusters correspond with the types of actors present at a particular level. Thus, in situations of concentrated accountability (i.e., low dispersal, low moral intensity, and human agency), AI robot accountability may be effected through well-defined and local actors. These could include the supplying AI design company, the implementing company, and the user/s. An example might be deploying AI cleaning or even stocking robots within large warehouses. Contrastingly, in situations of widely dispersed accountability (i.e., high dispersal, high moral intensity, and low human agency), accountability clusters may draw in numerous formal and informal actors across macro-institutional and supra-territorial arenas in order to provide accountability. Examples of such AI robot settings include international peacekeeping, military and/or humanitarian applications, where deployments require accountability across different geo-political and legal arenas.
The Accountability Challenge of AI Robots
Figure 1 highlights our connection between the locus of morality and moral intensity in a context-specific AI robot application setting, along with the corresponding accountability mechanisms likely required (discussed in more detail below). Drawing this link in relation to AI robot applications is necessary for the following reasons. To start, in a non-AI environment, if a manager or employee were to make an unethical decision or engage in an illicit practice (e.g., harassment, discrimination, deception, theft), the question of who should be accountable is likely fairly concentrated (e.g., local/proximate to the person, department, or organization), such that the individual decision-maker—the moral agent—may be held solely responsible. The administration of any punitive or restorative accountability mechanisms is also likely to be local (line manager, human resource management, training programs, whistle-blowing procedures). Crucially, the focus and scope of these accountability mechanisms are not widely dispersed. However, introducing AI robots into such settings greatly complicates the interrelationship between the would-be wrong-doer (or unconscious/complicit ‘wrong-doer’ for that matter) and the corresponding restorative mechanisms, a complication exacerbated by the technological opacity associated with using AI robots.
In the event of an ethical issue (perhaps in error) that causes harm directly, or facilitates harm indirectly, the question of accountability is far more complex. Is the AI robot, supporting AI system, developer, implementing organization, overseeing manager, industry regulator, or government responsible? Martin (2019, p. 129) broached this problem with regard to the question of accountability for the un/intended consequences of algorithms, whose agency varies from “simple if–then statements to artificial intelligence (AI), machine learning and neural networks.” Because of this, Martin (2019, p. 130) finds it inevitable that “all algorithmic decisions will produce mistakes” that, if left undetected, could reproduce unfairness, inequality, and harm for different stakeholders. Thus, even if ‘Roboethics’ are designed-in, for example, to inhibit AI robots’ unethical decisions, ethical problems may emerge and persist that disperse accountability across a range of actors and activities beyond the initial designer. We thus anticipate something akin to ‘Algorithmic Accountability’ in the context of AI robot applications and urge discussion of the context-specific mechanisms for accountability that would include, but extend well beyond, the initial coding role of the AI designer or user (Table 1 suggests a preoccupation with designer-user accountability). The next sections therefore discuss key drivers of accountability in AI robots’ use for the business ethics and wider management studies community. Table 2 connects the ‘accountability clusters’ with the ethical categories of illegal, immoral, permissible, and supererogatory. Each intersection is illustrated with examples of AI robots’ use for different purposes and in different settings.
Locus of Morality
AI robots’ use influences the extent to which the locus of morality, defined as the autonomy to choose an ethical course of action (Bruder, 2020), is concentrated in human agents (weak AI agency) or shifts towards AI robots, where human agency becomes less straightforward (strong AI agency). Strong AI agency does not imply the absence of human agency but rather its hidden nature. Recent research from the field of ‘Roboethics’ (Leventi et al., 2017) indicates that AI robots may not only share agency with humans in application settings but may learn from them and ‘improve’ (machine learning). Advancements in robotics have led to the emergence of ‘smart robots,’ which Westerlund (2020, p. 35) defines as “autonomous artificial intelligence (AI) systems that can collaborate with humans. They are capable of ‘learning’ from their operating environment (…) in order to improve their performance.”
Theoretically, repeating poor human decisions, failing to notice certain harms or injustices, or even unwittingly causing them are all possible consequences of escalating AI moral agency. In building our vertical axis (Fig. 1) we combine the notion of the locus of morality with Martin’s (2019, p. 131) assertion that “Algorithms relieve individuals from the burden of certain tasks, similar to how robots edge out workers in an assembly line. Similarly, algorithms are designed for a specific point on the augmented-automated continuum of decision-making.”
We note here that the extreme upper end of the continuum—AI robots as autonomous ethical decision-makers—is at present theoretical. Full autonomy is a hypothetical scenario that serves as an orientation point rather than an actual, attainable quality of AI robots. This is subject to the state of technology and the social license around AI acceptance over time, and even strong AI ethical agency may well remain ‘bounded’ in its autonomy. However, Westerlund (2020) points to the potential for a broad spectrum of AI agency, ranging from robots as passive recipients of human ethics (objects of programmed ethical codes) to highly active agents (subjects of ethical judgment). Westerlund (2020) also suggests that, over time, AI robots may become both recipients and shapers of socio-cultural ethical norms at a more macro-social level. Thus, the locus of morality is likely to shape both ‘local’ accountability responses, emphasizing the role of designer-user accountability solutions (Bench-Capon, 2020), and broader macro-social transformations that may require temporally undefined responses by a constellation of organizational, governmental, and civic agents and structures.
These considerations have informed how we structure the vertical axis in Fig. 1—the ‘locus of morality.’ In normative ethical theory, a primary assumption is that ethical decisions about the respective harm or freedom resulting from an action (right rule/best outcome) are conditional upon a rational choosing agent. Depending on the strand of moral philosophy, this could involve necessary human characteristics such as capacities for moral reasoning about the rights and consequences of actions, a socially acquired sense of virtue and moral character, as well as, more existentially, some innate notion of a moral impulse, empathy, and/or care for others. The latest research in business ethics suggests that managers can call upon their ‘moral imagination’ (Johnson, 1994), encompassing reasoning, empathy, and sheer intuition, in reaching the best possible ethical decision (Tsoukas, 2020). The current force of technological development, at least for some AI robot applications, is towards human imitation, including, especially, the way we make choices. This would most certainly include choices of an ethical nature (or outcome), ranging from compliance and imitation to, potentially, autonomous judgment. There is a moral risk in imitation, however, since an “AI system is a tool that will try to achieve whatever task it has been ordered to do. The problem comes when those orders are given by humans who are rarely as precise as they should be” (Kaplan & Haenlein, 2020, p. 46).
Thus, the vertical axis captures, on a continuum from human to AI robot, where the locus of moral decision-making increasingly resides. At the lower end, in what others have called ‘Assisting AI,’ such as AI-assisted diagnostics, humans largely set the ethical parameters for AI robots and in all cases make the final judgments regarding ethical decisions. Humans can program ethical codes and make any necessary adjustments; the AI robots cannot. The boundaries between humans and AI robots as the origin (locus) of ethical decisions become increasingly blurred as we move up the continuum, where both humans and AI robots may assume different degrees of autonomy over ethical decisions. The top of the continuum represents a theoretically possible position (Westerlund, 2020) in which humans have almost no part in any ethical decision-making, leaving this entirely to the AI robot and raising, ultimately, the question of who is accountable for an un/ethical decision executed (or not) in this context (e.g., the system, the product, the company, or the government).
Moral Intensity
This leads us to the second determinant of accountability, that of moral intensity. The moral intensity of a given situation has been well documented in the descriptive ethical theory literature, most notably in Jones’ (1991) issue-contingent model showing how perceptions of moral intensity affect an individual’s decision-making. As we are concerned with how accountability clusters develop for particular AI robot application settings, we deploy moral intensity in a different way from Jones, defining it as the context-specific exigencies of vulnerability and scale that amplify un/intended consequences in AI robot application settings (i.e., the focus here is not the decision-making subject’s perception of moral intensity). As we explain later in our four clusters of accountability, moral intensity may stem from (any combination of) the number of humans potentially affected, the vulnerability of human agents, and/or the severity of current and legacy effects on a community or ecosystem. As our approach is a descriptive one, we do not discuss here how designers working from deontological or teleological frameworks formulate AI robots’ decisions (see Bench-Capon, 2020 for an explicit discussion of these).
In ethical theory, including normative ethics, researchers commonly emphasize the moral intensity of a decision. For example, weighing the magnitude and distribution of consequences—both beneficial and harmful—is the task of utilitarian ethical theory. Theories of justice recognize the relative vulnerability of certain actors over others, rendering them more likely to suffer potential harm or to be denied access to benefits. In each case, the harm/benefit lends a certain magnitude or intensity to the decision. In Heath’s conceptualization, moral intensity grows with increased interaction: “iteration of the interaction only intensifies their [individuals’] incentive to act in the same [morally questionable] way” in competitive settings (Heath, 2007, p. 360).
Overall, there is a spectrum of situations between low and high moral intensity (Jones, 1991). Factors such as time, magnitude, proximity, and distribution of consequences can play a role. An individual employee regularly taking longer rest breaks than his/her colleagues sits at a different level of moral intensity than a hospital that prioritizes profits over patient safety in the long run. Using examples from our framework, we can similarly argue that a faulty AI cleaning robot will, for the most part, produce comparatively benign outcomes (getting briefly lost) compared with an AI robot that fails to make the correct diagnosis in health service encounters.
Accountability Clusters for AI Robots
While we have already begun to discuss AI accountability above, it is first necessary to draw a detailed link between our theoretical constructs and their corresponding accountability clusters as presented in Fig. 1. The locus and intensity of morality in AI robot settings necessitate special consideration of accountability and governance. Although concerned principally with algorithms, Martin (2019) underlines the inevitability of decision mistakes (both human and system). Intentionally or otherwise, managers will occasionally make poor decisions. The issue here, and of pressing concern for business ethics scholars, is what happens when mistakes do occur (e.g., mis-reporting a company’s financial health): “Ungoverned decisions, where mistakes are unaddressed, nurtured, or even exacerbated, are unethical. Ungoverned decisions show a certain casual disregard as to the (perhaps) unintended harms of the decisions; for important decisions, this could mean issues of unfairness or diminished rights” (Martin, 2019, p. 132).
We argue that while AI robots’ capability to collect and record information that may indicate mistakes (or their causes) is extremely powerful, their capacity for identification, interpretation, judgment, and deliberation may be correspondingly minimal. Weak AI ethical agents may not recognize unethical decisions that require correcting. Moderate AI ethical agents may imitate and repeat them (as ‘good’). Strong AI ethical agents may overlook serious harms (e.g., via lack of empathy) in pursuing other ‘good’ organizational ends. In short, AI robots’ decisions and practices that precipitate harm, inequality, and unfairness may go uncorrected over time. It is thus necessary for stakeholders to think seriously about accountability issues in specific AI robot settings. In response, we suggest four accountability clusters as an indication of the actors, resources, and activities required to address accountability. In short, accountability requirements, as determined by the locus of morality and moral intensity, may be local, concentrated, and ad hoc (an organizational supervisor), or widely dispersed across private, public, and civil society agents in an ongoing discourse of accountability (Buhmann et al., 2019). There will be markedly different requirements for accountability clusters between, for example, situations where the locus of morality is mostly concentrated within human agents (weak AI ethical agency) and moral intensity is low (e.g., office cleaning), and situations of stronger AI ethical agency where moral intensity is far higher (e.g., AI carers or soldiers). However, we do not provide new ethical norms (e.g., principles for managerial accountability per se) but indicate context-specific accountability clusters that may well include norm-making administrative mechanisms.
In this section, we respond to Buhmann et al.’s (2019) warning from algorithmic research that there may be special, discrete accountability characteristics specific to AI and system-learning technologies (such as technical opacity) that render expectations of accountability highly fluid and nuanced. Rather than provide a rigid typology of accountability mechanisms that managers or policy makers must follow, we interpretively develop clusters that fall loosely into the four accountability clusters in Fig. 1. In a sense, we demonstrate here how managers or policy makers might use our theorization in scenario planning around AI robot accountability. From our theorization, we developed four clusters (see Fig. 1) that delineate context-specific applications of AI robots, enabling corresponding consideration of appropriate ethical categories. Note that the depicted clusters are not mutually exclusive but cumulative: each is nested inside the next, like Russian (matryoshka) dolls, incorporating an increasing number of agents as moral intensity and AI agency rise. For each accountability cluster, we also discuss the normative ethical categories of the illegal, immoral, permissible, and supererogatory. We characterize the ethical dynamics and corresponding accountability clusters, providing further corresponding examples of AI applications (see sources in relation to the examples in Table 3).
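To make the cumulative, nested logic of the clusters concrete, the following minimal sketch shows one way the two axes of Fig. 1 could be encoded. It is purely illustrative and not part of the framework itself: the Application class, the ai_agency and moral_intensity scores, and the 0.25 layering step are hypothetical assumptions introduced here only to mirror the matryoshka-doll reasoning above.

```python
from dataclasses import dataclass

# The four accountability clusters of Fig. 1, innermost (most local) first.
CLUSTERS = [
    "Cluster 1: Professional norms",
    "Cluster 2: Business responsibility",
    "Cluster 3: Inter-institutional normativity",
    "Cluster 4: Supra-territorial regulations",
]

@dataclass
class Application:
    """A hypothetical description of an AI robot application setting."""
    name: str
    ai_agency: float        # 0.0 = locus of morality fully with humans, 1.0 = with the AI robot
    moral_intensity: float  # 0.0 = low (e.g., office cleaning), 1.0 = high (e.g., LAWS)

def accountability_clusters(app: Application) -> list[str]:
    """Return every cluster implicated by an application, innermost first.

    Clusters are cumulative (nested like matryoshka dolls): rising AI agency
    and moral intensity add further layers of actors rather than replacing
    the more local ones. The averaging and the 0.25 step are illustrative only.
    """
    dispersal = (app.ai_agency + app.moral_intensity) / 2
    layers = min(len(CLUSTERS), 1 + int(dispersal // 0.25))
    return CLUSTERS[:layers]

if __name__ == "__main__":
    examples = [
        Application("smart heating system", ai_agency=0.1, moral_intensity=0.1),
        Application("agricultural weeding robot", ai_agency=0.5, moral_intensity=0.3),
        Application("healthcare data management", ai_agency=0.6, moral_intensity=0.7),
        Application("lethal autonomous weapon system", ai_agency=0.9, moral_intensity=1.0),
    ]
    for app in examples:
        print(f"{app.name}: {accountability_clusters(app)}")
```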
Cluster 1 Professional Norms
Cluster 1 represents a relatively local and concentrated accountability cluster, characterized by applications with low AI ethical agency and low moral intensity, where questions of accountability are largely contained within a well-defined designer–device relationship. Akin to safety certification schemes for products, designers make ethical decisions and then encode them into non-reflexive tasks and behaviors that AI robots can and cannot perform. We distinguish here between AI robots as imitators (such as certain chatbots that support booking processes) and AI robots that simply follow pre-programmed codes of conduct (such as smart heating systems). Considering that designers shape the technical features and may incorporate ethical considerations, professional norms play a key role here, especially since designers are unlikely to receive much pressure from governments and international bodies on AI robots’ use. It is at this lowest level of agency and intensity that we would situate applications mainly pertaining to supererogation; smart heating systems that achieve environmentally friendly solutions without imposing significant risk on stakeholders are a good example. However, ‘low’ risk does not imply ‘no’ risk of ethical issues arising, meaning that a degree of accountability is always required, albeit more locally administered in these settings.
Low moral intensity and a locus of morality that is closely attributable to humans characterize this cluster. We include smart heating and cleaning systems in homes in this category: humans set the exact activity details (for instance, the cleaning route and sequence), and we consider the activity type typically harmless. Immoral or illegal use of these AI robots appears unlikely, so their moral intensity is limited. The locus of responsibility lies primarily with humans (e.g., to select the degree to which the property is heated). Overall, this group represents low ethical risks which, where appropriate, require human supervision, even if such measures play more of a preventive role than one of managing previously experienced issues with certain AI robots. It is difficult to construe illegal applications of, for instance, smart heating and cleaning systems, but we cannot exclude the possibility that some smart systems could be reprogrammed for the unsolicited monitoring of someone’s private life or business matters beyond their original purposes (e.g., cleaning, heating). A potential immoral application is when someone misleads others about his/her inability to attend an event due to compromised mobility and, with the support of a mobility AI robot, attends another event instead. An example of permissibility in this category is that a cleaner may need to seek other work if his/her client sets up a smart cleaning system at home or in the workplace.
Cluster 2 Business Responsibility
Cluster 2 represents an accountability cluster characterized by moderate moral agency (with the locus of responsibility still more closely linked to humans or groups of humans) within contexts where moral intensity is moderate. This could mean contexts where there are few humans or where, for instance, the nature of the task poses little threat to humans or ecosystems despite an increased level of AI agency. Interestingly, application contexts that might prove difficult or impossible for complex organic life to operate in, such as mining, deep sea, or space exploration, could invite AI robots with a considerable degree of autonomy. The relative lack of ecological or human threat would likely result in a more concentrated cluster of mostly professional and/or organizational-level actors within temporally bounded moments (e.g., a user organization following pre-existing industry regulations). This makes ‘business responsibility’, referring to the liability of the organization that uses the AI robot, characteristic of this cluster. Moreover, it is in this cluster that we might see potentially permissible decisions, practices, and outcomes. An emphasis on setting clear parameters for AI robots based upon organizational values, goals, mission, and codes of conduct would most likely complement designer-led AI robot ethics.
Agricultural AI robots for insect detection are part of this group, operating with near-full autonomy while posing low physical or other risks to humans. AI robots intended for weeding and seed-planting also belong to this category. Similarly, repair and inspection AI robots can enter spaces that humans would struggle to reach and use sensors that augment human sensing capacities. For example, the flexible elastomeric two-head worm AI robot imitates inch-worm motion, carries sensors that explore its environment, and learns repair-work patterns; companies can use it for repair and inspection (Tahir et al., 2018). The permissibility of such AI robots lies in the fact that, although larger groups of workers may lose their jobs, the humans who continue to work in AI robot-enhanced environments can enjoy improved work conditions (Fleming, 2019). With fewer workers in an AI-robotized environment, there is a reduced risk of workers engaging in dangerous tasks; this increased safety element represents supererogation. Unlikely but potential illegal applications include causing harm to humans through negligence or as a planned criminal act, or the even more efficient harvesting of illegal drugs with the help of AI robots. An immoral application may be to pressurize the human workforce to ‘compete’ with the performance of AI robots operating in different sites of an organization. The presence of a ‘supervising’ human agent may still be required to provide ad hoc and/or strategic monitoring to ensure alignment with industry codes, as well as to correct any ethical ‘blind spots’ that designers or organizations may have.
Cluster 3 Inter-institutional Normativity
Cluster 3 reflects situations of relatively high AI ethical agency coupled with relatively high moral intensity. Accountability may be relatively dispersed between actors subject to the institutional norms of, for example, a regional industry and/or national context, prompting the need for interorganizational liaising on ethical implications. Regulatory, industry, trade union, and civil society institutional actors, for example, might be present, but within a national or region-specific context. Inter-institutional normativity refers to the nature of the decision-making processes through which actions and outcomes are concluded to be ethically desirable. In this cluster the interaction between different organizations plays a significant role (rather than the focus resting on a single organizational setting). AI robot applications in this group deserve special attention to minimize occurrences that involve immoral decisions, practices, and outcomes. While we do not exclude legal mechanisms altogether (after all, laws have a moral basis), we emphasize here the focus upon institutions—including certain governmental bodies—relevant to how industry uses AI robotics. We might anticipate the content of institutional norms to vary according to the geo-political context; however, the presence of institutional norms would likely reflect some kind of social contract to protect citizens in contexts of heightened vulnerability.
Examples in this cluster are the use of AI-supported healthcare data management systems (with the need for interorganizational liaising between professional healthcare bodies, programmers, and the government) as well as AI-supported crime-prevention systems (typically operated at the national level, even though international collaborations are increasingly important for crime prevention). Humans may maintain strategic control of care planning and resource allocation decisions in healthcare and crime-prevention data management systems, which are then implemented in AI robots’ daily encoded tasks. Similarly, AI robots applied to social auditing, which can encourage social distancing and other relevant safety measures, belong to this group. Potential ethical transgressions are less likely to originate from the AI robot itself (as it is an imitator), but the lack of AI robot judgment means that any un/intended consequences of poor human decisions may routinely go unnoticed and be perpetuated by AI robots. This lack of AI reflexivity could perpetuate unfairness, inequality, social exclusion, or even harm in various learning and rehabilitation environments. Strong normative institutional accountability mechanisms need to be in place not only to set the parameters for actors implementing AI robots in such settings but also to measure and provide feedback upon shortcomings against a set of agreed norms. An example of accountability mechanisms that may not have been in place is the deletion of more than 400,000 criminal records, including criminals’ fingerprint information, from UK-wide databases (Reuters, 2020). Deleting criminal records could have occurred with non-AI-enhanced systems as well, but AI robots can accelerate the speed and volume of such damage to data that is vital from a societal security perspective. The absence of appropriate data-monitoring mechanisms raises the question of immorality at institutional levels (in this case the police). We cannot entirely exclude the possibility of illegality; for example, in a hypothetical scenario, a police officer with access to the compromised criminal records system might want to cover up someone’s criminal activities, but this is yet to be established as the investigation is under way.
Cluster 4 Supra-territorial Regulations
Cluster 4 represents a cluster of applications characterized by strong AI ethical agency, high moral intensity, and the widest dispersal of accountability between actors. As a result, accountability clusters are likely to be fluid and complex, requiring ongoing discourse between designers, users, organizations, industry, and regulatory bodies, and thus supra-territorial regulations. Clusters may comprise multiple actors with a range of competing vested interests (e.g., national and regional government, national and international law, civil society, industry bodies, and corporations). Supra-territorial regulations refer to the need for collaboration between individual and organizational actors at an international level. It is at this layer that we might encounter situations that result in illegal decisions, practices, and outcomes. An emphasis on strengthening and overseeing regulatory mechanisms at the highest level (e.g., including international legal apparatus, media, and civil society) might be necessary to complement more local and regional mechanisms. Examples of AI robot applications falling into this category could involve certain health and care services and military application contexts, especially where there are a high number of affected people and/or the nature of the application implies significant vulnerability.
The high level of accountability dispersal does not imply that AI robots ‘usurp’ the role of ethical human decision-making, but it becomes increasingly difficult to attribute AI robots’ acts to specific individuals or organizations. If not managed properly, illegal use of AI robots is likely to occur in this group. AI robots in this group can be increasingly autonomous. An example of very high moral intensity, where the perceived locus of morality also falls far from individual human beings, is the use of Lethal Autonomous Weapon Systems (LAWS). Their operational autonomy is very high, with very little to no human involvement. Agreement about avoiding or allowing certain uses of these AI robots is international in nature, as defense typically is (apart from internal conflicts). Besides LAWS, there are other considerably autonomous AI robots, such as military drones and Big Dog (Lin et al., 2014), which is regarded as a carrier of military equipment rather than an attack robot, yet still supports the war effort. Highly autonomous AI killer robots make decisions on their own; we could consider their manufacturers as facilitators but, according to Byrne (2018), not as murderers themselves. Intergovernmental regimes are required to collaborate to hinder the illegal use of military AI robots. Depending on the regulatory setting, the use of LAWS is typically illegal. In the absence of legal prohibition, their use may be immoral. Using other military AI robots may be permissible (e.g., for self-defense purposes) and even supererogatory (e.g., to save lives in a natural disaster).
Driverless cars are another example of autonomous AI robots, where the locus of accountability does not lie primarily with the human behind the wheel. A driverless car may set out with a program that incorporates speed limit guidance but learn that other cars exceed the limit and conclude that it should speed too. Tragically, there have been several incidents in which a Tesla car traveling over the speed limit resulted in deaths (Etzioni & Etzioni, 2017), prompting Tesla to further examine its autopilot driving system. The regulatory environment for driverless cars is increasingly international, as they are becoming an inherent part of international mobility. While country specificities relevant to driverless vehicles may apply (e.g., the lack of a general speed limit on the German Autobahn), there is an increasing need for consistency at a supra-territorial level (similar to EU driving licences being valid in any of the member states). Further, while using AI robots in care homes can increase elderly life quality (Broekens et al., 2009), it also implies some risks, especially where it is unclear who should ‘supervise’ these AI robots, or when assigned supervisors neglect to check on the AI-enhanced care robots and whether their use complies with international health and safety standards.
Dynamic Contextual Factors
There are some dynamic factors that we need to highlight as caveats to the interpretation of Fig. 1. The dimensions we suggest will be subject to movements over time depending upon the actual context-specific application. The key trends that may influence applications of the framework include the changing state of technology, for example, with machines varying in the extent of human imitation they possess (analysis-intuition). Another factor is relative labor/skill displacement (Wright & Schultz, 2018), where certain (low-skill) workplaces are potentially decimated depending on the level of imitable specialisms in the workforce. Reduced employment opportunities and AI-supported warfare among countries trigger equality-related concerns as well: “Unless policies narrow rather than widen the gap between rich, technologically-advanced countries and poorer, less-advanced nations, it is likely that technology will continue to contribute to rising inequality” (Wright & Schultz, 2018, p. 829). Finally, there is the factor of unknown outcomes. For example, we currently do not know whether AI robots will make better, or more consistent, moral decisions than humans. Nor do we know whether they may be able to redefine moral parameters independently. For instance, it appears that AI has the potential both to reinforce and to reduce racism (Noble, 2018). AI robots can learn, for instance, swear words and bullying behaviors (Dormehl, 2018). While such behaviors may not be illegal, we can regard them as immoral, and policymakers should therefore support the development of appropriate monitoring mechanisms. It is noteworthy that, although attention is shifting from the technical innovativeness of AI robots towards concerns around their interaction with humans, AI robots do not develop immoral behaviors themselves but learn them from humans, for instance through pre-programming or the imitation of humans.