Artificial intelligence (AI) robots are intelligent, semi-autonomous machines, software and systems that have the increasing ability to formulate decisions in collaboration with humans or on their own, to support humans. They enable higher efficiency and increase quality of life (Čaić et al., 2018) and so can have a profound effect on business and society. AI robots assist elderly care (Broekens et al., 2009; Jiang & Cameron, 2020), support medical diagnoses (Yoon & Lee, 2019), and ease transportation in the case of autonomous cars (Hassan et al., 2018). At the same time, however, using AI robots triggers societal changes and thereby yields ethical implications (Alles & Gray, 2020; Veruggio et al., 2016; Wirtz et al., 2018). As Westerlund (2020) suggests, AI robots may come to significantly shape the socio-political order over time, raising ethical issues and accountability concerns at the highest level, as is already the case with algorithms and personal data harvesting (e.g., the Cambridge Analytica scandal, Wang et al., 2020).

Although it is difficult to predict the pace of technological innovation in AI robots, a business and management-based discussion of AI robot ethics is necessary to mitigate future risks (Russell et al., 2015) and to assist both the codification of AI robot behavior (Gunkel, 2012) and the development of accountability mechanisms in business settings. Although the field of business ethics is relatively new within the ethics domain, it can help guide individual and organizational users alike, who are aiming to better manage their own and, in the case of organizations, their employees’ ethical behaviors (Trevino & Brown, 2004). While some users may already “have a higher level of global awareness” to act ethically (Huang & Rust, 2011, p. 44), several others need more guidance. In this paper we summarize insights from normative and descriptive ethical theory, drawing on the former’s capacity to determine actionable, categorical responses to ethical challenges of AI robots (of which we identify four categories), and the latter’s explanatory capacity in relation to ethical dimensions shaping AI robot contexts (of which we cite two: moral agency and intensity). These theoretical elements combine to provoke thinking about ‘accountability clusters’ (a multitude of actors, levels, and institutions) needed to govern AI robot applications in business.

To the best of our knowledge, applying normative and descriptive ethics to AI robots in business settings is a novel approach, as it enables one to integrate concepts from as yet unrelated knowledge domains with concepts pertaining to AI robots and reflections on their practical matters. We use these insights to develop a new framework that incorporates the following constructs: locus of morality (human to AI agency), moral intensity and accountability dispersal, accountability clusters, and the four ethical categories of illegal, immoral, permissible, and supererogatory. Whereas the locus of morality depends on where moral decision-making lies (Kagan, 2018), moral intensity refers to the extent of issue-related moral imperative across different situations and considers the impact a single action can have on multiple victims or beneficiaries (Jones, 1991). However, we are also mindful that ethical implications of AI robots stem from a unique web of interrelationships between loosely connected actors such as AI robot designers, individual and organizational users, industry and government bodies as well as civil society groups. Therefore, one must consider new forms of AI robot accountability, which we describe theoretically as ‘accountability clusters.’ These are the networks of relevant actors positioned at different levels who constitute mechanisms for AI robot accountability (e.g., personal/professional, organizational, institutional, supra-territorial).

Regarding this important phenomenon, we broadly adhere to Beu and Buckley’s (2001, p. 65) definition: “Accountability is the perceived need to defend or justify behaviors to an audience with reward/sanction authority, where the rewards/sanctions are perceived to be contingent upon the audience evaluation of such conduct.” However, as Buhmann et al. (2019) note with regard to ‘Algorithmic Accountability,’ there may be special, discrete accountability characteristics specific to AI and system learning technologies (such as technical and strategic opacity) that render expectations of accountability highly fluid. AI robot applications have the potential to significantly complicate traditional, formal mechanisms of accountability (e.g., workplace tribunal) by dispersing moral agency between potentially numerous agents, who may be responsible for a grievous harm, but who are spatially, organizationally, and even temporally disconnected. Accountability dispersal, then, is the extent to which accountability spreads across different actors and levels, where high accountability dispersal poses communication and coordination challenges for stakeholders seeking to ensure the ethical use of AI robots. Our theorization of accountability dispersal recommends investigating inter-linkages between actors within and across levels, echoing the rich tradition of scholarly multilevel research into related topics like governance and Corporate Social Responsibility (Balakrishnan et al., 2017; François et al., 2019; Young & Marais, 2012). Accordingly, while we broadly follow a micro–meso–macro analytical pathway, our four ‘accountability clusters’ more precisely detail the nature of actors and characteristics of contexts surrounding AI robot applications.

Therefore, our aim in this Special Issue of the Journal of Business Ethics is to promote the building of ‘theoretical bridges’ (Hitt et al., 2007) across levels within future business and society research into AI and AI robots. We explore, via four clusters, how different AI application contexts interact with notions of moral intensity, agency, and accountability, extracting clusters of variously positioned actors, and prompting consideration of some fundamental outcomes for accountability of AI robots’ use. In doing so, we highlight the value of applying descriptive and normative theory to a vital business–society issue.

The potential contributions of developing a new conceptual framework for AI robot accountability are as follows: first, to expand the current understanding of the ethical implications of AI robot applications, we formulate the aforementioned accountability clusters. These clusters indicate necessary actors and activities to ensure accountability, for instance, corporations that design and implement AI robots, industry, governments/regulators, and civil society organizations. Here, we define ‘accountability clusters’ as a nexus of relevant actors positioned at different levels that constitute mechanisms for AI robot accountability (e.g., individual, organizational, industrial/governmental, supra-territorial), which would serve to govern un/intended ethical transgressions in AI robot application contexts.

Second, for each accountability cluster, we integrate the three normative ethical categories of illegal, immoral, and permissible from Heath (2014), who outlined them as pillars of the “market failures approach to business ethics” that is part of normative business ethics. In addition, we extend Heath’s work by incorporating the category of supererogatory use (Driver, 1992), characterized by creating excess value, to also cover potential positive implications of using AI robots. The concept of supererogatory use is also rooted in normative business ethics. Incorporating both negative (i.e., illegal, immoral) and neutral/positive (i.e., permissible, supererogatory) aspects opens avenues for the ethical investigation of innovation around AI robots that enables one to create a more nuanced understanding compared to the binary view of something that is either ethical or unethical (e.g., Bommer et al., 1987; Constantinescu & Kaptein, 2015; Khalil, 1993). This translates into moving from ‘yes or no’ ethical evaluations towards the quadrangle of ‘yes, please’ (supererogatory)/‘alright’ (permissible)/‘rather not’ (immoral)/‘not at all’ (illegal). Further, these categories represent different layers of moral intensity with supererogatory use as the least morally intense. We derive our framework from drawing on examples of AI robot uses, with a special emphasis on human–technology interaction from an ethical and regulatory perspective (Johnson, 2015; Lobschat et al., 2021; Wirtz et al., 2018).

Third, we address the scarcity of studies, especially at the macro level, on the business ethics of AI robot use (with Wirtz et al., 2018 as an exception) to extend the literature in the field of normative business ethics. Finally, our approach can inform policy recommendations for regulatory bodies, firms, and individuals regarding developing and controlling ethically astute AI robots, for instance through using the framework for scenario planning, stakeholder mapping, and AI robot design. Drawing on the proposed new business ethics framework, we offer some consideration points for regulatory intervention related to AI robots’ learning behaviors and their role in decision-making processes.

The study is structured as follows: we first seek conceptual clarity pertaining to AI robots. Then we discuss ethics, with special regard to business ethics that serves as the basis for the new framework outlined in the subsequent section, along with accountability clusters. Next, we outline ethical implications in the discussion section that includes regulatory implications. Finally, we highlight limitations and directions for future research, including research topics and research questions. The first author is grateful to Dr Sareh Pouryousefi for the helpful discussions on business ethics.

AI, Robots, and AI Robots

Conceptual clarity is required to evaluate relevant ethical implications and so the section below focuses on providing definitions of AI, robots, as well as AI robots. The three are connected phenomena and the third phenomenon, AI robots, is of special interest for this study. While we focus on AI robots, one also can apply the outlined ethical considerations to a wider range of AI applications.


AI

AI refers to developing intelligent, autonomous systems that can perform tasks otherwise attributed to human intelligence, such as visual or speech recognition, language translation, and reacting to events in the environment (King, 2017). Thus, AI consists of intelligent software. Algorithms, and especially decision-making algorithms, based on machine learning techniques are inherent parts of AI (Martin, 2019). AI is commonly associated with programming machines to participate in human-like thought processes such as learning, reasoning, and self-correction (Benlian et al., 2019). AI spreads across a variety of activity areas such as machine learning, knowledge representation, modeling human cognition, data science, augmented reality, computer imaging, audio–visual signaling, and natural language processing, just to mention a few. The outputs AI generates include information, human–computer communication, and even physical objects (Baskerville et al., 2020). AI also increasingly supports, and in some areas even substitutes, human decision-making (Baskerville et al., 2020).


Robots

Historically, the concept of robots referred to automatic devices that perform functions ordinarily ascribed to human intelligence (Calo, 2017). Robots act upon codifiable, pre-determined goals and follow cognitive structures to adapt to their environment. They can recognize parts of their environment such as physical objects or human voices (Aleksander, 2017) and carry out specific, pre-programmed actions, for instance moving objects and interacting with humans (Admoni & Scassellati, 2017). A robot’s level of control is limited: humans have permission to correct or stop a robot’s actions (Zieba et al., 2011). With their information processing capacity and domain-specific cognitive abilities, robots often exceed human performance in various areas (Ma & McGroarty, 2017). Further specifications regard robots as automatically controlled, reprogrammable, multipurpose machines, which one can either fix in place or make mobile for use (cf. the ISO 8373 standard). Traditionally, complex IT systems that enable learning and support decision-making did not support robots. While several robots match these characteristics, the traditional approach focuses more on the physical entity of robots.

AI Robots

Advancing the definition of AI robots and continuously adjusting it to the latest level of technological development have been ongoing challenges. Among others, European legislation in the form of the Parliament’s resolution to the Commission on Civil Law Rules on Robotics (EP 2015/2103, INL) has called for an up-to-date, specific, and actionable definition that encompasses both AI and robots. For this paper, the working definition of AI robots is that they are semi-autonomous, insensate entities that exhibit behaviors of living beings and possess the abilities of learning and decision-making to facilitate human activity (based on Aleksander, 2017; King, 2017). This working definition draws on the definition of robots, with special regard to performing activities through sensing and adapting to the environment (Aleksander, 2017), as well as on the definition of AI, especially pertaining to human-like intelligence but not feelings (King, 2017). Some AI robots can have a physical representation, thus are machines with intelligent software (e.g., Nao or Pepper), while others are only virtually represented without the necessary physical representation of a specific machine (e.g., Siri or Alexa). Consequently, following Wirtz et al. (2018), scholars regard virtual AI software with the ability to learn over time and the capacity for autonomous action as an example of AI robots.

AI robots combine automation mechanisms and sophisticated learning and decision-making abilities to support humans. However, AI robots are still incapable of processing or expressing emotions and other vital aspects of human-to-human communication (Ciborra & Willcocks, 2006). Partly due to the lack of empathy, AI robots’ learning processes differ from human learning (Kamishima et al., 2018) in that AI robots require shortened learning cycles (Bilgeri et al., 2019), which yields a higher capacity for processing large amounts of information (Bera et al., 2019) and can thus reduce human workload. AI robots typically consist of an agent (which can be a physical entity such as a robot, but also software) and its environment (in which the agent acts and to which it has an intelligent connection, for instance through sensors; the environment of an agent may contain further agents) (Choudhary et al., 2016). Although AI robots have already appeared as witnesses in court, they are typically, though perhaps not accurately, regarded as morally passive tools (Westerlund, 2020). Consequently, the legal standing of AI robots and their liability are still under discussion (Calo, 2017), but they can assist judges’ work in preparing background materials and assessing expert testimonies (Katz, 2013). AI robots’ decisions are becoming increasingly important and part of daily life. However, it has been an ongoing challenge to distinguish between decisions suitable for AI robots and those exclusively suitable for humans (Baskerville et al., 2020; Ciborra & Willcocks, 2006).

AI Robots and Ethics

Literature on AI robots identifies concentrated accountability around users (e.g., Buhmann et al., 2019; Westerlund, 2020), manufacturers (Bench-Capon, 2020; Buhmann et al., 2019), other organizations such as governments (Wright & Schultz, 2018) and, partially, AI robots themselves (Bench-Capon, 2020). Researchers have applied a variety of approaches from virtue ethics (Bench-Capon, 2020) to social contract theory (Wright & Schultz, 2018). The focus of these studies varies across several topics, from the abdication of human responsibility (Allen & Wallach, 2014) to algorithmic accountability (Buhmann et al., 2019). A common pattern is that authors characteristically point to concentrated accountability and open the discussion on the need for an ethical dimension in the context of AI robots. Table 1 presents some key points from the literature and positions our study against the identified sources.

Table 1 Some key points from the literature on AI robots and the positioning of this study

Focus on Normative Business Ethics

Ethics refers to the implicit and explicit norms and principles one should follow in the absence of governmental guidelines or other external regulatory regimes (Heath, 2008). Interdisciplinary ethical analysis incorporates a wide range of managerial, economic, social, technical, and legal issues (Zsolnai, 2006) and discusses regulatory actions for mitigating ethical concerns. Business ethics is an interdisciplinary field that pertains to a range of normative issues in markets, including questions surrounding individual behavior and responsibility, organizational and institutional ethics, as well as the just design of markets, regulations, and political oversight (Norman, 2011). Normative business ethics refers to a field of business ethics that investigates how ethics can inform decision-making (Hasnas, 1998). Kagan (2018) suggests that normative business ethics involves substantive proposals about how to act, how to live, how to do business, and what kind of person to be. It identifies morally acceptable actions under given conditions and derives key regulatory/management protocols (Cropanzano et al., 2013). As opposed to normative ethics, meta-ethics (Miller, 2003) aims at delineating moral concepts and justifying moral theory without suggesting what comprises ‘right’ or ‘wrong.’ By contrast, normative business ethics studies moral principles and develops guidance for resolving individual/institutional moral dilemmas and market design. Furthermore, normative business ethics raises questions about how to engage with non-human entities.

Within the field of normative business ethics, the literature distinguishes between several approaches. Virtue ethics (Hursthouse, 1999) is primarily concerned with evaluating an individual’s inner states and the fit between actions and character. Deontology (Alexander & Moore, 2007) is the study of duty, i.e., what moral obligation requires us to do. Consequentialism (Peterson, 2013) argues that an act’s consequences determine its moral rightness. The ethical economy approach (Koslowski, 2001) argues for combining ethics and economics towards a comprehensive theory of rational action and social choice theory (Arrow, 1973). The market failures approach to business ethics (Heath, 2014) seeks to formulate normative standards implicit in the basic economic assumptions underlying the market economy’s institutional mechanisms. It states that business and innovation require different rules than ordinary morality. The market failures approach sees market competition as driving the efficient allocation of goods and services to achieve greater common good (Heath, 2011). It understands regulatory and ethical intervention as levers to correct imperfections and thereby reduce any misallocation of resources (Heath, 2014).

Within normative business ethics, we draw on the market failures approach to business ethics in distinguishing between illegal, immoral, and permissible use (Heath, 2014). We complement these pillars with another normative business ethical category, which is the group of supererogatory actions (Driver, 1992). The rationale for using normative business ethics is its connection between practice and ethically ideal scenarios. With respect to the recent developments in AI robots, disregarding practical considerations would hinder the ethical evaluation of their use. Supererogatory actions represent an extra mile beyond what one expects morally (Driver, 1992; Mazutis, 2014). The three ethical categories outlined in the market failures approach are as follows: (1) Illegal is any action that is against the law and regulations; (2) Immoral is any action that only reaches the legal threshold’s bare minimum. Statements such as ‘we didn’t break the law’ signal legality but suggest discomfort with the morality system (Wilson & Series, 2002). Bardy et al. (2012) define morality as the set of prevailing behavior standards that facilitate cooperative behavior; and (3) morally permissible actions are those not requiring explanations of putative fairness or appropriateness.

In normative business ethics, and more specifically in the market failures approach to business ethics, an action can be directed at an object (object-specific actions) or occur without one (non-object-specific actions). Heath (2014) discussed object-specific unethical actions such as seeking privileges by using corporate assets or gaining insider information as a result of one’s strategic position. Ethically debatable non-object-specific actions include, for example, exerting abusive behaviors. To achieve a relative balance between suggestions on what ‘should’ and ‘should not’ happen, we included the supererogatory category that is meant to encourage rather than discourage certain actions. Together with this fourth group, our proposed new framework will consist of two prohibitive (illegal and immoral) and two non-prohibitive (permissible and supererogatory) categories. As captured in Fig. 1 and the discussion, we illustrate further below how one can interpret these categories in relation to the four accountability clusters.

Fig. 1 New framework for AI robot accountability

Extant ethical discussions typically draw on the binary logic that distinguishes between ethical and unethical acts and decisions. Bommer et al. (1987) identified different factors such as corporate goals, the juridical system, and religious or societal values that can encourage individual decision-makers towards ethical and unethical decisions. Ultimately, however, researchers classify decisions into the binary groups of ethical and unethical. Similarly, Constantinescu and Kaptein (2015) explore various drivers of behaviors and question the extent to which individuals or organizations can be made responsible. However, pertaining to the outcomes, these authors also stick to the binary categorization of ethical and unethical dealings. Khalil’s (1993) research examines ethical decision-making in the context of expert AI systems. He builds his argument by assuming that the decision-maker can choose among several actions that require evaluation as right or wrong, ethical or unethical. Khalil identifies reasons for ethical concerns. For example, expert systems lack human intelligence, emotions, and values, and possess certain biases. He presents a variety of drivers, yet classifies the outcomes either as ethical or unethical. These studies provide useful insights into the underlying mechanisms and factors that influence decision-making in individual and organizational contexts. Thus, they are useful for individuals, companies, and governments to review their decision-making processes. However, we argue that from an ethical viewpoint, besides the drivers of decision-making, the outcomes are also relevant. Instead of a rather simplistic binary ethical/unethical categorization, there is space for a more refined approach consisting of the illegal, immoral, permissible, and supererogatory ethical categories. Research can benefit from having more options than the two extremes for ethical evaluations, especially when it comes to AI robots, where technological advancements and the different uses of technology are so diverse that their categorization into ethical and unethical categories has become increasingly challenging and misaligned with practice.

A New Framework for AI Robot Accountability

The new framework can inform the ethical evaluations and subsequent action planning of managers, public policy makers, and civil society groups to better understand the implications of, and accountability responses to, AI robot applications. This is not intended as a prescriptive or static framework, given the inherent variation in application forms, movements in the state of technology and, crucially, shifts in societal expectations of what is and is not morally acceptable and legitimate (Suchman, 1995). We suggest two axial themes driving the framework, locus of morality and moral intensity, that combine in unique ways to render specific ‘clusters of accountability’ necessary for AI applications in business (Fig. 1). Figure 1 captures the increasingly dispersed nature of accountability, as an outcome of rising moral intensity and AI agency, and how this implicates different kinds of actors and levels of analysis. Our four accountability clusters correspond with the types of actors present at a particular level. Thus, in situations of concentrated accountability (i.e., low dispersal, low moral intensity, and human agency) AI robot accountability may be effected through well-defined and local actors. These could include the supplying AI designer company and the implementing company and user(s). An example of this might be deploying AI cleaning or even stocking robots within large warehouses. In contrast, in situations of widely dispersed accountability (i.e., high dispersal, high moral intensity, and low human agency), accountability clusters may draw in numerous formal and informal actors across macro-institutional and supra-territorial arenas in order to provide accountability. An example of these kinds of AI robot settings could be international peacekeeping, military and/or humanitarian applications where deployments require accountability across different geo-political and legal arenas.

The Accountability Challenge of AI Robots

Figure 1 highlights our connection between the locus of morality and moral intensity in a context-specific AI robot application setting, along with the corresponding accountability mechanisms likely required (which we discuss in more detail below). Drawing this link in relation to AI robot applications is necessary for the following reasons: to start, in a non-AI environment, if a manager or employee were to make an unethical decision or engage in an illicit practice (e.g., harassment, discrimination, deception, theft), the question of who should be accountable is likely to be fairly concentrated (e.g., local/proximity to the person, department, or organization) such that the individual decision-maker, the moral agent, may be held solely responsible. The administration of any punitive or restorative accountability mechanisms is also likely to be local (line manager, human resource management, training programs, whistle-blowing procedures). Crucially, the focus and scope of these accountability mechanisms are not widely dispersed. However, introducing AI robots into such settings greatly complicates the interrelationship between the would-be wrong-doer (or unconscious/complicit ‘wrong-doer’ for that matter) and corresponding restorative mechanisms, which the technological opacity associated with using AI robots exacerbates.

In the event of an ethical issue (perhaps in error) that causes harm directly, or facilitates harm indirectly, the question of accountability is far more complex. Is the AI robot, supporting AI system, developer, implementing organization, overseeing manager, industry regulator, or government responsible? Martin (2019, p. 129) broached this problem with regard to the question of accountability for the un/intended consequences of algorithms, whose agency varies from “simple if–then statements to artificial intelligence (AI), machine learning and neural networks.” Because of this, Martin (2019, p. 130) finds it inevitable that “all algorithmic decisions will produce mistakes” that, if left undetected, could reproduce unfairness, inequality, and harm for different stakeholders. Thus, even if ‘Roboethics’ are designed-in, for example, to inhibit AI robots’ unethical decisions, ethical problems may emerge and persist that disperse accountability throughout a range of actors and activities beyond the initial designer. We thus anticipate something akin to ‘Algorithmic Accountability’ to occur in the context of AI robot applications and urge discussing the context-specific mechanisms for accountability that would include, but extend well beyond, the initial coding role of the AI designer or user (Table 1 suggests a preoccupation with designer-user accountability). Therefore, the next sections discuss key drivers of accountability of AI robots’ use for the business ethics and wider management studies community. Table 2 connects the ‘accountability clusters’ with the ethical categories of illegal, immoral, permissible, and supererogatory. Each intersection is illustrated with examples of AI robots’ use for different purposes and in different settings.

Table 2 Accountability clusters and ethical categories

Locus of Morality

AI robots’ use influences the extent to which the locus of morality, defined as the autonomy to choose an ethical course of action (Bruder, 2020), is mostly concentrated in human agents (weak AI agency) versus human agency being less straightforward with concentration towards AI robots (strong AI agency). Strong AI agency does not imply the lack of human agency but instead the hidden nature of human agency. Recent research from the field of ‘Roboethics’ (Leventi et al., 2017) indicates that AI robots may not only share agency with humans in application settings but may learn from them and ‘improve’ (machine learning). Advancements in robotics have led to the emergence of ‘smart robots,’ which Westerlund (2020, p. 35) defines as “autonomous artificial intelligence (AI) systems that can collaborate with humans. They are capable of “learning” from their operating environment (…) in order to improve their performance.”

Theoretically, repeating poor human decisions, not noticing certain harms or injustices or even unwittingly causing them, are all possibilities of escalating AI moral agency. In building our vertical axis (Fig. 1) we combine the notion of the locus of morality with Martin’s (2019, p. 131) assertion that “Algorithms relieve individuals from the burden of certain tasks, similar to how robots edge out workers in an assembly line. Similarly, algorithms are designed for a specific point on the augmented-automated continuum of decision-making.”

We note here that the more extreme ends of the upper continuum—AI robots as autonomous ethical decision-makers—are at present theoretical. Full autonomy is a hypothetical scenario that serves as an orientation point rather than an actual, attainable quality of AI robots. This is subject to the state of technology and social license around AI acceptance over time, and even strong AI ethical agency may well experience some ‘bounded’ autonomy. However, Westerlund (2020) informs us about the potential for a broad spectrum of AI agency, ranging from robots as passive recipients of human ethics (the object of programmed ethical codes) to highly active agents (subjects of ethical judgment). Westerlund (2020) also suggests that AI robots may become recipients and shapers of the socio-cultural ethical norms at a more macro-social level over time. Thus, the locus of morality is likely to shape both ‘local’ accountability responses, emphasizing the role of designer-user accountability solutions (Bench-Capon, 2020) as well as broader macro-social transformations that may require temporally undefined responses by a constellation of organizational, governmental, and civic agents and structures.

These considerations have informed how we structure the vertical axis in Fig. 1, the ‘locus of morality.’ In normative ethical theory, a primary assumption is that ethical decisions about the respective harm or freedom resulting from an action (right rule/best outcome) are conditional upon a rational choosing agent. Depending on the strand of moral philosophy, this could involve necessary human characteristics such as capacities for moral reasoning about rights and consequences of actions, a socially acquired sense of virtue and moral character as well as, more existentially, some innate notion of a moral impulse, empathy, and/or care for others. The latest research in business ethics suggests that managers can call upon their ‘moral imagination’ (Johnson, 1994), including reasoning, empathy, and sheer intuition, in reaching the best possible ethical decision (Tsoukas, 2020). The current force of technological development, at least for some AI robot applications, is towards human imitation, including, especially, the way we make choices. This would most certainly include choices of an ethical nature (or outcome), ranging from compliance and imitation to, potentially, autonomous judgment. There is a moral risk of imitation, however, because an “AI system is a tool that will try to achieve whatever task it has been ordered to do. The problem comes when those orders are given by humans who are rarely as precise as they should be” (Kaplan & Haenlein, 2020, p. 46).

Thus, the vertical axis captures, on a continuum from human to AI robot, where the locus of moral decision-making resides. At the lower end, which others have called ‘Assisting AI’ (such as AI-assisted diagnostics), humans largely set ethical parameters for AI robots and in all cases make the final judgments regarding ethical decisions. Humans can program ethical codes and make any necessary adjustments. The AI robots cannot do this. The boundaries between human and AI robots as the origin (locus) of ethical decisions become increasingly blurred as we move up the continuum, where both humans and AI robots may assume different amounts of autonomy over ethical decisions. The top of the continuum represents a theoretically possible position (Westerlund, 2020), where humans have almost no part in any ethical decision-making, leaving this entirely to the AI robot and questioning, ultimately, who is accountable for an un/ethical decision executed (or not) in this context (e.g., the system, the product, the company or the government).

Moral Intensity

This leads us to the second determinant of accountability, that of moral intensity. Moral intensity of a given situation has been well documented in the descriptive ethical theory literature, most notably with Jones’ (1991) issue-contingent model showing how perceptions of moral intensity affect an individual’s decision-making. As we are concerned with how accountability clusters develop for particular AI robot application settings, we deploy moral intensity in a different way from Jones, defining it as the context-specific exigencies of vulnerability and scale that amplify un/intended consequences in AI robot application settings (i.e., the focus here is not the decision-making subject’s perception of moral intensity). As we explain later in our four clusters of accountability, moral intensity may stem from (any combination of) the number of humans potentially affected, the vulnerability of human agents and/or the severity of current and legacy effects on a community or ecosystem. As our approach is a descriptive one, we will not discuss here how designers from deontological or teleological frameworks formulate AI robots’ decisions (see Bench-Capon, 2020, for an explicit discussion on these).

In ethical theory, including normative ethics, researchers commonly emphasize the moral intensity of a decision. For example, the magnitude and distribution of consequences, both beneficial and harmful, is the concern of utilitarian ethical theory. Theories of justice recognize the relative vulnerability of certain actors over others, which makes them more likely to suffer potential harm or to be denied access to benefits. In each case, harm/benefit lends a certain magnitude or intensity to the decision-making. In Heath’s conceptualization, moral intensity comes with increased interaction: “iteration of the interaction only intensifies their [individuals’] incentive to act in the same [morally questionable] way” in competitive settings (Heath, 2007, p. 360).

Overall, there is a spectrum of situations between low and high moral intensity (Jones, 1991). Factors such as time, magnitude, proximity, and distribution of consequences can play a role. An individual employee regularly taking longer rest breaks than his/her colleagues involves a different level of moral intensity than a hospital that prioritizes profits over patient safety in the long run. Using examples in our framework, we can similarly argue that a faulty AI cleaning robot will, for the most part, result in more benign outcomes (getting briefly lost) than will an AI robot that fails to make the correct diagnosis in health service encounters.

Accountability Clusters for AI Robots

While we have already begun to discuss AI accountability above, first it is necessary to draw a detailed link between our theoretical constructs and their corresponding accountability clusters as Fig. 1 presents. The locus and intensity of morality in AI robot settings necessitate special consideration of accountability and governance. Although concerned principally with algorithms, Martin (2019) underlines the inevitability of decision mistakes (both human and system). Intentionally or otherwise, occasionally managers will make poor decisions. The issue here, and of pressing concern for business ethics scholars, is what happens when mistakes do occur (e.g., mis-reporting a company’s financial health): “Ungoverned decisions, where mistakes are unaddressed, nurtured, or even exacerbated, are unethical. Ungoverned decisions show a certain casual disregard as to the (perhaps) unintended harms of the decisions; for important decisions, this could mean issues of unfairness or diminished rights” (Martin, 2019, p. 132).

We argue that while AI robots’ capability to collect and record information that may indicate mistakes (or causes of) is extremely powerful, the capacity for identification, interpretation, judgment, and deliberation may be correspondingly minimal. Weak AI ethical agents may not recognize unethical decisions that require correcting. Moderate AI ethical agents may imitate and repeat them (as ‘good’). Strong AI ethical agents may overlook serious harms (e.g., via lack of empathy) in pursuing other ‘good’ organizational ends. In short, AI robots’ decisions and practices that precipitate harm, inequality, and unfairness may go uncorrected over time. Thus, it is necessary for stakeholders to think seriously about accountability issues in specific AI robot settings. In response, we suggest four accountability clusters as an indication of the actors, resources, and activities needed to address accountability. Accountability requirements, as determined by the locus of morality and moral intensity, may be local, concentrated, and ad hoc (organizational supervisor), or widely dispersed across private, public, and civil society agents in an ongoing discourse of accountability (Buhmann et al., 2019). There will be markedly different requirements for accountability clusters between, for example, situations where the locus of morality is mostly concentrated within human agents (weak AI ethical agency) and where moral intensity is low (e.g., office cleaning) compared with situations of stronger AI ethical agency and where moral intensity is far higher (e.g., AI carers or soldiers). However, we do not provide new ethical norms (e.g., principles for managerial accountability per se) but indicate context-specific accountability clusters that may well include norm-making administrative mechanisms.

In this section, we respond to Buhmann et al.’s (2019) warning from algorithmic research that there may be special, discrete accountability characteristics specific to AI and system learning technologies (such as technical opacity) that render expectations of accountability highly fluid and nuanced. Rather than provide a rigid typology of accountability mechanisms that managers or policy makers must follow, we interpretively develop groupings that fall loosely into the four clusters of accountability in Fig. 1. In a sense, we demonstrate here how managers or policy makers might use our theorization in scenario planning around AI robot accountability. The four clusters (see Fig. 1) delineate context-specific applications of AI robots, which enables corresponding consideration of appropriate ethical categories. Note here that the depicted clusters are not mutually exclusive but cumulative: each of them is nested inside the other, like Russian (matryoshka) dolls, incorporating an increasing number of agents as moral intensity and agency rise. For each accountability cluster, we will also discuss the normative ethical properties of illegal, immoral, permissible, and supererogatory. We characterize the ethical dynamics and corresponding accountability clusters, providing further corresponding examples of AI applications (see sources in relation to the examples in Table 3).

Table 3 Illustrative cases of AI robot applications in different contexts

Cluster 1 Professional Norms

Cluster 1 represents a relatively local and concentrated accountability cluster, characterized by applications with low AI ethical agency and low moral intensity where questions of accountability are largely contained to a well-defined designer–device relationship. Akin to safety certification schemes for products, designers make ethical decisions and then encode them into non-reflexive tasks and behaviors that AI robots can and cannot do. We distinguish here between AI robots as imitators (such as certain chat bots that support booking processes), and AI robots in this context that simply follow pre-programmed codes of conduct (such as smart heating systems). Considering that designers shape the technical features and may incorporate ethical considerations, professional norms play a key role here, especially considering that designers are unlikely to receive much pressure from governments and international bodies on AI robots’ use. It is at this lowest level of agency and intensity that we would situate models mainly pertaining to supererogation. For instance, smart heating systems achieving environmentally friendly solutions without imposing significant risk on stakeholders are a good example. However, ‘low’ risk does not imply ‘no’ risk of ethical issues arising, implying that a significant degree of accountability is always required, albeit more locally administered in these settings.

Low moral intensity and a locus of morality that is closely attributable to humans characterize this cluster. We include smart heating and cleaning systems at homes in this category: humans set the exact activity details (for instance, the cleaning route and sequence) and we consider the activity type typically harmless. Immoral or illegal use of these AI robots appears to be unlikely, thus their moral intensity is limited. The locus of responsibility is primarily with humans (e.g., to select the degree to which the property would be heated). Overall, this group represents low ethical risks, which, where appropriate, require human supervision, even if these measures play a preventive role rather than managing previously experienced issues with certain AI robots. It is difficult to construe illegal applications of, for instance, smart heating and cleaning systems, but we cannot exclude the possibility that some smart systems can be reprogrammed for the unsolicited monitoring of someone’s private life or business matters besides their original purposes (e.g., cleaning, heating). A potential immoral application is when someone misleads others about his/her inability to attend an event due to compromised mobility and attends another event instead, with the support of a mobility AI robot. An example of permissibility in this category is that a cleaner may need to seek other work if his/her client sets up a smart cleaning system at home or workplace.

Cluster 2 Business Responsibility

Cluster 2 represents an accountability cluster characterized by moderate moral agency (with the locus of responsibility still more closely linked to humans/groups of humans) within contexts where moral intensity is moderate. This could mean contexts where there are few humans or, for instance, the nature of the task poses little threat to humans or ecosystems, despite an increased level of AI agency. Interestingly, application contexts that might prove difficult or impossible for complex organic life to operate in, such as mining, deep sea, or space exploration, could invite AI robots with a considerable degree of autonomy. The relative lack of ecological or human threat would likely result in a more concentrated cluster of mostly professional and/or organizational-level actors within temporally bounded moments (e.g., user organization following pre-existing industry regulations). This makes ‘business responsibility’ characteristic of this cluster, which refers to the liability of the organization that uses the AI robot. Moreover, it is in this cluster that we might see potential permissible decisions, practices, and outcomes. An emphasis on setting clear parameters for AI robots based upon organizational values, goals, mission, and codes of conduct would most likely complement designer-led AI robot ethics.

Agricultural AI robots for insect detection are part of this group, with nearly fully autonomous operations that pose low physical or other risks to humans. In addition, AI robots intended for weeding and seed-planting belong to this category. Similarly, repair and inspection AI robots can enter spaces that humans would struggle to reach and use sensors that augment the sensing capacities humans have. For example, the flexible elastomeric two-head worm AI robot imitates inch-worm motion and holds sensors that explore its environment and learn repair-work patterns. Companies can use it for repair and inspection (Tahir et al., 2018). The permissibility of such AI robots lies in that although larger groups of workers may lose their jobs, the humans who continue the work in AI robot-enhanced environments can enjoy improved work conditions (Fleming, 2019). With fewer workers in an AI-robotized environment, there is a reduced risk of workers engaging in dangerous tasks. The increased safety element represents supererogation. Potential, if unlikely, illegal applications include causing harm to humans due to negligence or as a planned criminal act, or the even more efficient harvesting of illegal drugs with the help of AI robots. An immoral application may be to pressurize the human workforce to ‘compete’ with the performance of AI robots operating in different sites of an organization. The presence of a ‘supervising’ human agent may still be required to provide ad hoc and/or strategic monitoring to ensure alignment with industry codes, as well as correcting any ethical ‘blind spots’ that designers or organizations may have.

Cluster 3 Inter-institutional Normativity

Cluster 3 reflects situations of relatively high AI ethical agency coupled with relatively high moral intensity. Accountability may be relatively dispersed between actors subjected to institutional norms of, for example, a regional industry and/or national context that prompt the need for interorganizational liaising on ethical implications. Regulatory, industry, trade union, and civil society institutional actors, for example, might be present but in a national or region-specific context. Inter-institutional normativity refers to the nature of decision-making processes in which one concludes actions and outcomes to be ethically desirable. In this cluster the interaction between different organizations plays a significant role (instead of the focus on a single organizational setting). AI robot applications in this group deserve special attention to minimize occurrences that involve immoral decisions, practices, and outcomes. While we do not exclude legal mechanisms altogether (after all laws have a moral basis), we emphasize here the focus upon institutions—including certain governmental bodies—relevant for how industry uses AI robotics. We might anticipate the content of institutional norms to vary according to the geo-political contexts; however, the presence of institutional norms would likely reflect some kind of social contract to protect citizens in contexts of heightened vulnerability.

Examples in this cluster are the use of AI-supported healthcare data management systems (with the need for interorganizational liaising between professional healthcare bodies, programmers, and the government) as well as AI-supported crime-prevention systems (typically operated at the national level, even though international collaborations are increasingly important for crime prevention). Humans may maintain a strategic control of care planning and resource allocation decisions in the healthcare and crime-prevention data management systems, and then they implement these decisions in AI robots’ daily encoded tasks. Similarly, AI robots applied for social auditing that can encourage social distancing and other relevant safety measures belong to this group. Potential ethical transgressions are less likely to originate from the AI robot itself (as it is an imitator), but the lack of AI robot judgment means that any un/intended consequences of poor human decisions may go routinely unnoticed and be perpetuated by AI robots. The lack of AI reflexivity could perpetuate unfairness, inequality, social exclusion or even harm, in various learning and rehabilitation environments. Strong normative institutional accountability mechanisms need to be in place not only to set the parameters for actors implementing AI robots in such settings but also to measure and provide feedback upon shortcomings against a set of agreed norms. An example of a case where accountability mechanisms may not have been in place is the deletion of more than 400 thousand criminal records, including criminals’ fingerprint information, from UK-wide databases (Reuters, 2020). Deleting criminal records could have occurred in the case of non-AI-enhanced systems as well, but AI robots can increase the speed and volume of this kind of damage to data that are vital from a social security perspective. The absence of appropriate data-monitoring mechanisms raises the question of immorality at institutional levels (in this case the police). We cannot entirely exclude the possibility of illegality. For example, in a hypothetical scenario, a police officer who had access to the compromised criminal records system might want to cover up someone’s criminal activities, but this has yet to be established as the investigation is under way.

Cluster 4 Supra-territorial Regulations

Cluster 4 represents a cluster of applications characterized by strong AI ethical agency, high moral intensity, and the widest dispersal of accountability between actors. As a result, accountability clusters are likely to be fluid and complex, requiring ongoing discourse between designers, users, organizations, industry, and regulatory bodies, and thus supra-territorial regulations. Clusters may comprise multiple actors with a range of alternate vested interests (e.g., national and regional government, national and international law, civil society, industry bodies and corporations). Supra-territorial regulations refer to the need for collaboration between individual and organizational actors at an international level. It is at this layer that we might have situations that result in illegal decisions, practices, and outcomes. An emphasis on strengthening and overseeing regulatory mechanisms at the highest level (e.g., including international legal apparatus, media, and civil society) might be necessary to complement more local and regional mechanisms. Examples of AI robot applications falling into this category could involve certain health and care services, and military application contexts, especially where there is a high number of affected people and/or the nature of the application implies significant vulnerability.

The high level of accountability dispersal does not imply that the AI robots ‘usurp’ the role of ethical human decision-making, but it becomes increasingly difficult to attribute AI robots’ acts to specific individuals or organizations. If not managed properly, illegal use of AI robots is likely to occur in this group. AI robots in this group can be increasingly autonomous. An example of very high moral intensity, where the perceived locus of morality also falls far from individual human beings, is the use of Lethal Autonomous Weapon Systems (LAWS). The operational autonomy is very high with very little to no human involvement. The agreement about avoiding or allowing certain uses of these AI robots is international in nature, as defense typically is (apart from internal conflicts). Besides LAWS there are other considerably autonomous AI robots such as military drones and Big Dog (Lin et al., 2014), which is considered a carrier of military equipment rather than an attacker robot, yet still supports the war effort. Highly autonomous AI killer robots make decisions on their own; we could consider their manufacturers as facilitators but, according to Byrne (2018), not as murderers themselves. Intergovernmental regimes are required to collaborate to hinder the illegal use of military AI robots. Depending on the regulatory settings, the use of LAWS is typically illegal. In the absence of legal prohibition, they may be immoral. Using other military AI robots may be permissible (e.g., for self-defense purposes) and even supererogatory (e.g., to save lives in a natural disaster).

Driverless cars are another example of autonomous AI robots, where the locus of accountability is not primarily with the human behind the wheel. For example, a driverless car may set out with a program that incorporates speed limit guidance but learn that other cars exceed the limit and conclude that it should speed too. Tragically, there were several incidents in which a Tesla car traveling over the speed limit resulted in deaths (Etzioni & Etzioni, 2017). This caused Tesla to further examine its autopilot driving system. The nature of the regulatory environment for driverless cars is increasingly international as they are becoming an inherent part of international mobility. While country-specificities relevant for driverless vehicles may apply (e.g., the lack of a speed limit on the German highway, the Autobahn), there is an increasing need for consistency at a supra-territorial level (similar to permitting the use of EU driving licences in any of the member states). Further, while using AI robots in care homes can increase elderly life quality (Broekens et al., 2009), it also implies some risks, especially where it is unclear who should ‘supervise’ these AI robots or when assigned supervisors neglect checking on the AI-enhanced care robots and the compliance of their use with international health and safety standards.
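The cumulative, nested structure of the four clusters can be illustrated with a minimal sketch. The actor labels below are our simplified reading of the cluster descriptions above, not an exhaustive list, and the names `Cluster`, `CLUSTERS`, and `accountable_actors` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    number: int
    name: str
    # Actors that join at this level (assumed, simplified labels).
    added_actors: tuple[str, ...]


# Ordered from the innermost to the outermost "doll".
CLUSTERS = (
    Cluster(1, "Professional norms", ("designers",)),
    Cluster(2, "Business responsibility", ("user organization",)),
    Cluster(3, "Inter-institutional normativity",
            ("regulators", "industry bodies", "trade unions", "civil society")),
    Cluster(4, "Supra-territorial regulations",
            ("intergovernmental regimes", "international law")),
)


def accountable_actors(cluster_number: int) -> list[str]:
    """Return the actors of a cluster plus those of every cluster nested inside it."""
    if not 1 <= cluster_number <= len(CLUSTERS):
        raise ValueError(f"unknown cluster: {cluster_number}")
    actors: list[str] = []
    for cluster in CLUSTERS:
        if cluster.number > cluster_number:
            break
        actors.extend(cluster.added_actors)
    return actors


if __name__ == "__main__":
    for cluster in CLUSTERS:
        print(cluster.number, cluster.name, accountable_actors(cluster.number))
```

The sketch captures only the nesting: each outer cluster retains the actors of the inner ones, so accountability widens rather than shifts as moral intensity and AI agency rise. Where a given application sits remains an interpretive judgment that the code does not attempt to automate.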

Dynamic Contextual Factors

There are some dynamic factors that we need to highlight as caveats to how to interpret Fig. 1. The dimensions we suggest will be subject to movements over time depending upon the actual context-specific application. The key trends that may influence applications of the framework include the changing state of technology, for example, with machines varying in terms of the extent of human imitation they possess (analysis-intuition). Another factor is the relative labor/skill displacement (Wright & Schultz, 2018), where certain (low skill) workplaces are potentially decimated depending on the level of imitable specialisms in the workforce. Reduced employment opportunities and AI-supported warfare among countries trigger equality-related concerns as well. “Unless policies narrow rather than widen the gap between rich, technologically-advanced countries and poorer, less-advanced nations, it is likely that technology will continue to contribute to rising inequality” (Wright & Schultz, 2018, p. 829). Finally, there is a factor of unknown outcomes. For example, currently we do not know whether AI robots will make better moral decisions than humans, or more consistent ones. We also do not know if they may be able to redefine the moral parameters independently. For instance, it appears that AI has the potential to both reinforce and reduce racism (Noble, 2018). AI robots can learn, for instance, swear words and bullying behaviors (Dormehl, 2018). While such behaviors may not be illegal, we can regard them as immoral and thus, policymakers should support developing appropriate monitoring mechanisms. It is noteworthy that although attention is being diverted from the technical innovativeness of AI robots towards concerns around their interaction with humans, AI robots do not develop immoral behaviors themselves but learn those from humans, for instance, through pre-programming or the imitation of humans.


Theoretical Implications

This study offers a new framework for AI robot accountability that conceptualizes AI robots’ ethical implications along the dimensions of locus of morality and moral intensity. Considering that an AI robot may have a potentially high degree of decision-making discretion (much like a human employee—after all, imitation and learning from humans is among the goals of AI robot designers), in the event of an accidental error, misconception or even a well-intended misinterpretation of the data-response that causes harm directly or facilitates harm indirectly, the question of accountability may be widely dispersed (among for instance the developer, maintainer, the implementing organization, overseeing manager, informed by industry norms and regulations). We argue that this significant ethical deviation in accountability needs acknowledgment and exploration in a context-specific way. This study identifies accountability clusters that we can characterize by different concentrations of accountability—without the provision of new ethical norms (e.g., principles for managerial accountability)—that can inform norm-making administrative mechanisms.

The study draws on normative business ethics, especially the market failures approach to business ethics (Heath, 2014), when it comes to describing moral intensity. Heath’s work revisits the unanswered questions of organizations’ ethical responsibilities, and what management should take into account to ensure ethical operations. It is noteworthy that the market failures approach to business ethics has grown from the heated debate between shareholder and stakeholder theories (Young, 2015). Milton Friedman states that the responsibility of business is to meet shareholders’ needs by increasing profits (Friedman, 1970). In contrast, stakeholder theory argues that the firm’s goal is to act in the interests of all its stakeholders, not only in its shareholders’ interests (Freeman, 1994). Heath (2014) claims that organizations should avoid distorting competition by focusing only on profit maximization. The ethical categories of illegal, immoral, and permissible use stem from Heath’s conceptualization of organizational action.

However, Heath’s market failures approach to business ethics distinguishes only between acts that need to be prohibited either by law (illegal) or by following moral standards (immoral) or can be allowed (permissible). It lacks suggestions on what organizations should encourage to exceed minimum ethical requirements. Thus, this study extends the conceptualization into the broader family of normative ethics and integrates the ethical category of supererogatory use (Driver, 1992; Mazutis, 2014) to be able to offer insights not only on restrictions but also on encouraging certain developments. Finally, this study views Heath’s primary focus on firms as a limitation of the market failures approach because the ethical status of any occurrence appears to be dependent exclusively on companies. Besides firms, individuals (Soares, 2003) and governments have ethical responsibility too, as has been highlighted in debates on the ethical implications of environmental problems (Fahlquist, 2009). Thus, this study broadens the conceptualization from a corporation-focus in a way that encompasses the ethical standing of individuals and the military.

The proposed framework has the potential to trigger further academic discussions on moral accountability and moral intensity and advances knowledge through the systematic combination of the two phenomena for AI robots’ use. Regarding moral intensity, we consider illegal, immoral, permissible, and supererogatory use, which encourages a non-binary approach towards the ethicality of AI robots’ use. The position of certain examples of AI robots’ use across the outlined clusters is fluid over time. For instance, driverless cars would already fall under the umbrella of supra-territorial regulations but are still rather close to inter-institutional normativity, where they would have been located a decade ago. There is a time perspective to how the clusters develop because with time the stakeholders’ position may change in relation to ethical, social, and environmental matters (Longoni & Cagliano, 2018). Further, the group of stakeholders relevant for certain AI robots’ use may widen or shrink, including designers, individual/organizational users, governments as well as intergovernmental regimes. Finally, Vogel (1992) acknowledges that the harmonization of ethical standards across different groups, regions, and countries is very slow, especially compared to the pace of technological innovation.

Regulatory Considerations for the Ethical Use of AI Robots

The challenge of moral intensity and moral accountability in using AI robots, if the pervasive nature of AI continues, creates a second challenge of how regulators should act to move towards an ethically appropriate application of AI. Regulators could use the proposed framework to inform policymakers’ discussions on morality and accountability in relation to AI robot use cases. For instance, based on relative moral intensity and the locus of morality, policymakers can situate certain AI robot uses across the outlined clusters, as part of scenario planning, to review accountability dispersal and its potential future development. In addition, the framework can help in stakeholder mapping that identifies stakeholder groups that should be included in relevant conversations currently and in the future. AI robot design experts may also benefit from using such a framework to enhance their ethical considerations and explore ethical possibilities in specific contexts. As the position of an AI robot use is not fixed but can move between clusters, regulatory considerations could reflect on this. For instance, imitation-driven chatbots used only in booking settings may still “qualify” for a morally more intense cluster if the chatbot learns and applies bullying. The AI-enhanced data management issues of criminal records, for instance, may well be situated in the cluster described by inter-institutional normativity. They can evolve into the supra-territorial regulations cluster with time, especially with the increasing need for sharing crime-prevention practices at an international level. A move between clusters of accountability requires including different stakeholders, and the proposed framework can help in developing an in-depth understanding of, and relevant reflections on, such moves.

We offer some consideration points for regulators and other decision-makers for ethically using AI robots regarding accountability and in relation to moral intensity. In doing so, one should pay special attention to learning mechanisms and decision-making processes as both are vital for developing and using AI robots (Baskerville et al., 2020; Benlian et al., 2019). Regulators should pay special attention not only to what AI robots should learn but also to what they should avoid learning. In the case of driverless cars, for instance, learning algorithms should include restrictions on learning dangerous driving behaviors. While decision-making is almost fully automated, humans should have the option to revert to a less autonomous mode but in a transparent way, i.e., it should be tracked when a human is only a passenger in the driverless car and when they act as drivers, so that accountability remains clear, and humans cannot blame their wrongdoing on AI if they cause an accident. There are certain AI robots in this category that should be prohibited from use, for instance, LAWS. Using LAWS imposes high risk on people’s lives, including civilians. Developing these autonomous weapons is beyond international peace agreements, yet some countries have invested in their development. A major ethical concern is that humans barely have any chance against highly accurate and intelligent killer robots (Byrne, 2018). Applying originally military AI solutions in non-military settings, for instance, for lifesaving in emergency situations (e.g., identifying and saving people and animals in the event of a major flood) should be further encouraged as these fall under supererogatory applications.

We suggest planning for regular audits of the applied AI learning mechanisms because they may require updating as medical procedures are innovated. The reliance on historical data has its limitations in formulating optimized solutions and we should acknowledge this at a regulatory level. For decision-making, approval-seeking mechanisms involving responsible contact persons should be arranged. Relying on AI-suggested healthcare and social care solutions without expert approval holds high ethical risks. However, AI robots can utilize historical patient data to inform healthcare and social care decisions and are exempt from biases. Note that risk assessment is important as it allows organizations (vis-à-vis hospitals) to find an acceptable balance between safety and avoiding dehumanization. Further, given heightened stakeholder precariousness and the desire of managers to fill short-term labor gaps, government-level controls are necessary to prevent organizations from misapplying AI. This would ensure a portfolio/needs-based approach where AI capabilities match human needs.

Learning mechanisms should include some restrictive features covering processes that can harm human safety (including work practices and food safety), which AI robots should avoid learning and, in some cases, even unlearn. For decision-making, management should periodically monitor and review the AI robot’s role in decision-making to further improve situational response. Comparing outcomes achieved with AI robots against outcomes before or without them is advisable unless the comparison compromises safety. For instance, one should still regularly check AI robots working in insect identification and elimination to ensure that they meet human safety regulations, even if we typically consider these AI robots as benevolent towards humans.

One should regularly monitor and review the AI robot’s learning mechanisms to avoid potential ethical concerns, e.g., pertaining to privacy and data management. These issues are not entirely new but can be amplified through using AI. Likewise, management should periodically review the level of AI decision-making, even if humans in this group exercise control. A challenge is to ringfence time for these evaluations when there is low ethical risk overall. However, awareness helps because low-likelihood events can still have a high impact, for instance, in privacy and data management practices. In a hypothetical scenario, a cleaning AI robot could collect confidential data from the building in which it is used, thus raising organizational or national security concerns.

Limitations and Directions for Future Research

This study follows Western ethical standards and consequently alternative explanations may apply internationally or even across different Western countries. Similarly, legal frameworks may vary across countries in an international regulatory context. The examples we used in this study derive from regulatory debates, court appeals, and previous papers, and not from primary data collection. While we see this as a potential limitation, and thus suggest further empirical investigation, the ethical nature of our study justifies using extant cases. Ethicists differ considerably in their approaches to using empirical data. Theorists argue that the empirical branch of business ethics lacks thorough theoretical grounding due to the focus on data rather than reasoning (Doorn, 2010). However, pragmatically oriented ethicists argue that empirical descriptions, in the form of considering the pragmatic conditions, are vital parts of applied ethics and that there is a need to fill the gap between ethical principles and guidance of action with empirical considerations (Birnbacher, 1999). Doing so improves the argumentation and increases the applicability of ethics research in support of everyday judgments and decisions. However, even among the pragmatically oriented ethicists, the general rationale is to bring in empirical insights in various ways and incorporate them in a reflective manner (Van Thiel & Van Delden, 2010) rather than necessarily having to collect primary data (for instance, by interviewing patients and medical professionals who use AI robots to improve healthcare outcomes) to engage with ethical thought.

Although we center attention on AI robots in this study, which simultaneously represents a focus and a limitation, researchers could potentially extend the new framework presented here to the ethical investigation of a wider group or specific groups of AI applications, such as AI robots’ use in healthcare, educational settings, or the energy sector.

Table 4 identifies several potential future research topics and related research questions. The illegal, immoral, permissible, and supererogatory normative ethical categories, for instance, are worthy of further research. This future investigation could include the study of moral responsibility, for instance, by assessing how responsibility evolves across different cases and different levels of automation. Learning and decision-making mechanisms are highly relevant, and so future research could explore how organizations should manage them in an ethically responsible manner. At a more applied level, fellow researchers could further refine ethical implications specific to using AI support systems in which AI robots are embedded.

Table 4 Future research agenda for the ethical use of AI robots