1 Introduction

AI is presently being used in virtually every domain, including the military. Many AI applications already exist in the military context, and their adoption is accelerating, including applications that require ethical evaluation and judgement. One application receiving considerable attention is (assisted) decision-making systems for targeting and engagement purposes (Footnote 1), particularly lethal autonomous weapon systems (LAWS), and questions arise about how such systems should be regulated and how International Humanitarian Law (IHL) should deal with them [1,2,3,4,5]. Alongside the focus on IHL, another critical issue is the allocation of responsibility for actions taken by these systems. This concern is prominent both in international debates on the governance of military AI and in academic discussions. A significant concern in ethical debates revolves around the attribution of responsibility for these systems when something goes wrong, commonly referred to as ‘responsibility gaps’. This term, introduced by Andreas Matthias (2004) in the context of autonomous machines and later applied to LAWS by Robert Sparrow [6], continues to be a focal point of discussion in both philosophical and legal realms [7,8,9]. In the context of LAWS, responsibility gaps manifest when it seems appropriate to hold someone responsible for a certain (bad) outcome, but according to our standard theories of moral responsibility attribution, there is no suitable target of blame.

It is important to recognize that neither autonomous systems in general nor LAWS in particular are simply machines; they are sociotechnical systems involving both machine components (hardware and software) and a multitude of human actors (developers, operators, users), leading some authors to describe the responsibility gap as a ‘many hands’ problem [10]. Nevertheless, the predominant focus in discussions of responsibility gaps and LAWS has been on assigning backward-looking responsibility. This debate revolves around the argument that, due to the growing autonomy of the system (characterized by its self-learning ability and its ability to adjust behavior in response to feedback from the environment), neither the machine nor the humans involved can be held responsible. The machine, although potentially causally responsible, lacks moral agency, which means that, according to standard theory, it cannot be held responsible. At the same time, humans are considered to lack the necessary control and/or knowledge, essential conditions for assigning moral responsibility. It would therefore be unfair to hold either party responsible [6, 8]. In an effort to address this so-called responsibility gap, multiple solutions have been proposed. Among them are authors advocating a revision of the conditions described in our standard theories of responsibility attribution; by modifying the standard conditions, it becomes feasible to hold humans (or machines) responsible (Footnote 2). Proposals in this category include conceptually engineering the concept of responsibility differently [16], introducing a ‘blank check’ liability under which humans hold themselves responsible for the actions of military robots [17], and legal proposals suggesting the adoption of strict liability in criminal law.

Responsibility attribution is generally based on the effects directly caused by an action. This becomes challenging in the context of LAWS due to their self-learning and autonomous capabilities. Multiple authors have argued against direct individual criminal liability for operators or military commanders in the context of LAWS, emphasizing that such liability requires intent or recklessness and a direct causal link between action and outcome ([18]; Dickinson, 2019; Chengeta [19], pp. 16–27; Crootof [7], p. 1376; Egeland [3], p. 106; Saxon [20], p. 28). Given that neither operators nor commanders are directly involved in the execution of attacks by LAWS and lack a guilty mind due to very limited meaningful human control, establishing direct individual criminal liability appears highly unlikely. However, an intriguing solution proposed to address responsibility gaps in LAWS is rooted in responsibility ascription within military hierarchies, specifically through the concept of command responsibility (Footnote 3). The doctrine of command responsibility is part of customary international law, as codified in art. 28 of the ICC Statute, art. 86–87 of Additional Protocol I to the Geneva Conventions and art. 7(3) of the ICTY Statute, and as interpreted in the case law of the UN international tribunals for the former Yugoslavia and for Rwanda (among others in the Delalić, Blaskić, and Kayishema and Ruzindana cases). The doctrine holds military leaders criminally responsible for crimes committed by their subordinates, constituting a form of liability for omissions related to the acts of subordinates rather than a separate criminal offense ([21], p. 18). This approach is interesting because it enables commanders to be held responsible for indirectly caused effects. Responsibility can be attributed based on the causal-like relationship (supervision) in which humans stand, allowing responsibility to follow the lines of supervision.

The solution of using command responsibility to solve responsibility gaps in LAWS has been both proposed and criticized in the AI ethics and legal literature (Footnote 4). However, what has been lacking in such ethical and legal evaluations, and remains under-researched, is the link with recent empirical psychological studies. This integration could not only lead to a better understanding of the practical implications of applying theoretical frameworks such as the doctrine of command responsibility in real-world contexts, but also advance the ongoing theoretical debate, as empirical studies can serve to validate or challenge theoretical assumptions in the literature and identify new patterns and trends in the assignment of responsibility. Therefore, this article specifically addresses responsibility gaps within the context of LAWS and aims to examine the extent to which applying the doctrine of command responsibility as a solution to these gaps creates a moral gap: what is presented as morally well-argued and persuasive is not necessarily consistent with how this moral issue is perceived from a psychological perspective. The purpose is to anchor discussions in empirical realities rather than relying exclusively on normative arguments. The ultimate goal of this paper is to assess the extent to which the analogous application of command responsibility to LAWS is feasible and desirable.

In Sect. 2, we explore the initial theoretical plausibility of applying the doctrine of command responsibility to autonomous systems, considering factors such as anthropomorphic tendencies, perceptions of commanders' responsibility when deploying autonomous systems, and the impact of interaction with autonomous systems on moral decision-making and Sense of Agency (SoA). We suggest that commanders may be suitable candidates for responsibility. In Sect. 3, we argue that it may not be prudent to apply the doctrine analogously, based on research in traditional hierarchical settings which suggests that neither coerced agents nor commanders experience agency over their actions, potentially leading to a genuine responsibility gap. However, in Sect. 4 we argue that this gap may not inherently pose a problem, since not all normative solutions need to be grounded in descriptive facts (Footnote 5), drawing parallels with traditional hierarchical settings where we generally attribute responsibility to commanders despite their apparent lack of a sense of responsibility. Section 5 concludes by advocating caution in applying the doctrine to non-human agents and offers practical and theoretical reflections to demonstrate why the gap between empirical facts and ethical solutions should not be too wide.

2 Viability of applying command responsibility to artificial subordinates

In this section, we explore the viability of applying the doctrine of command responsibility, traditionally applicable to situations involving human superiors and human subordinates [CH ∧ SH], to scenarios that include an artificial subordinate [CH ∧ SA]. Command responsibility is a jurisprudential doctrine in international criminal law that aims to hold military commanders accountable for war crimes committed by their subordinates. The doctrine has been developed and applied in varied ways by ad hoc tribunals and the International Criminal Court (ICC), leading to inconsistent codification in international agreements ([28], pp. 265–266). Generally, it holds superiors liable if they have actual control over a subordinate, know or have reason to know of the subordinate’s criminal acts, and fail to take necessary and reasonable measures to prevent or punish them ([7], p. 1378). Ethically, the application may seem plausible, as commanders are generally considered responsible for adverse outcomes caused by human subordinates, despite these subordinates often having significant discretionary power. Moreover, the argument that the concept of command responsibility was initially formulated to regulate interactions between humans on the battlefield provides no prima facie reason for excluding its adaptation to address new challenges involving artificial entities. The application of the doctrine as a possible answer to the responsibility gap is not new; it has been defended in the AI ethics and legal literature, by Himmelreich among others, in the context of LAWS [22,23,24]. In this section, we argue that such an adaptation might make sense not only from an ethical or legal standpoint but also from an empirical perspective. We outline three reasons drawn from recent literature in psychology, each of which we discuss in turn.

The first reason for considering AI as a subordinate agent is the human tendency toward anthropomorphizing. Anthropomorphizing refers to the inclination to attribute distinctively human characteristics to nonhuman entities [29]. Peter Singer and Joel Garreau [30] reported more than a decade ago on the intriguing tendency of humans to form relationships with machines. They illustrated this phenomenon by examining the behavior of US soldiers in Iraq and Afghanistan, who developed unexpectedly close personal bonds with their PackBots, giving them names, awarding them battlefield promotions, risking their lives to protect the ‘life’ of the robot and mourning their ‘deaths’ [30, 31]. This unique human-machine bonding is a result of the system’s integration within the military unit and the role it plays in battlefield operations. Another factor is the shape of the machine and perceived similarity to humans, as recent research indicates a favorable effect of anthropomorphic design features on human-related outcomes [32]. A further instance of this tendency affecting people’s judgments and decision-making is the greater reluctance of humans to sacrifice machines [33]. Thus, viewing AI as a subordinate agent is not as far-fetched as one may think. Moreover, a growing body of psychological studies in recent years has shown that we tend to attribute moral responsibility to nonhuman agents, leading us to be willing to blame robots [34,35,36,37,38,39,40,41,42,43,44,45,46]. This tendency is particularly noticeable in scenarios where the robot is described as autonomous compared to scenarios where it is described as nonautonomous, suggesting that the degree to which people view robots as social actors and attribute blame to them depends on the perceived degree of autonomy [47]. Worth mentioning is a recent paper by Kneer and Christen ([38], p. 3), who conducted a cross-cultural empirical study among Japanese, German and U.S. participants using Robert Sparrow’s famous example of an autonomous weapon system committing a war crime (Footnote 6). The study concluded that people show a considerable willingness to hold autonomous weapon systems morally responsible. This finding seems to contradict the hypothesis, often put forward in the philosophical literature, that people find the idea of morally responsible machines absurd, and demonstrates that people are far from dismissive of the possibility of assigning moral responsibility to a machine.

A second reason for considering the extension of the concept of command responsibility to non-human subordinates is that a commander is held responsible not only for war crimes committed by a human pilot, but also when dispatching an autonomous system. This follows from the experiment conducted by Kneer and Christen [38], which clearly demonstrated that commanders were deemed equally responsible in both conditions (Japan), and even significantly more responsible when dispatching an autonomous system than when human pilots were deployed (Germany and the US). This is consistent with the findings of Caspar et al. [48], indicating that explicit responsibility self-ratings were higher when the commander gave orders to a robot agent than when the commander gave orders to a human agent (p. 17). One possible explanation for the different levels of responsibility attribution could be that in the traditional case [CH ∧ SH], subordinates are believed to have more discretionary power than in situations involving autonomous systems [CH ∧ SA]. Consequently, the autonomy of the subordinates influences the perception of the commander’s responsibility. This may be attributed to the belief that commanders have a more active and direct role in the planning and deployment of military operations in these cases. Whether this influence is truly greater in [CH ∧ SA] situations would need to be assessed on a case-by-case basis, depending on the capabilities of the system. Nevertheless, the experiment in Caspar and colleagues’ 2021 paper appears to capture the intuition that commanders in these cases still determine the general parameters under which the systems operate, such as where, when, how and against whom military force may be used. Machines do not create tasks ex nihilo and are always constrained by hierarchical orders. In essence, the experiment supports the proposal by philosophers and lawyers that human agents can be held responsible for adverse outcomes caused by machines based on their supervisory role.

Thirdly, moral decision-making increases the SoA [49], and people’s conscious engagement in morally challenging tasks does not seem to be adversely affected by interaction with an autonomous system [50]. Before explaining this, let us first clarify what SoA is. SoA refers to the awareness that humans have of being the authors of their actions and thus of the consequences of these actions ([51,52,53]; Pyasik, Salatino et al. [54]). SoA enables us to perceive ourselves as causal agents [55] and is thus a precursor of feeling responsible for a deed. It is recognized as an important aspect of human consciousness and is closely related to moral responsibility ([49]; Caspar, Christensen, Cleeremans, & Haggard [56]). However, while responsibility is an explicit and social concept, SoA is often measured implicitly, among other techniques with the Intentional Binding (IB) effect. IB refers to the subjective compression of the interval between an action and its outcome, observed in active but not passive movement: participants are asked to estimate the time interval between an action they perform and its consequences [57,58,59]. A series of previous studies has shown that the time estimation between action and outcome is a valid implicit, quantitative measure of SoA [60,61,62] and is preferable to a subjective measurement of responsibility, which is subject to social desirability and other biases, such as the self-serving bias (e.g., Blackwood et al. [63, 64]). These findings imply that a commander might be a suitable candidate to be held responsible, as engagement and sense of agency remain intact despite the involvement of autonomous systems [50].
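To make the implicit measure more concrete, the sketch below illustrates how an IB-based agency index could in principle be computed from interval-estimation data. This is a minimal illustration under our own assumptions: the trial values, the data layout and the simple difference score are hypothetical and are not taken from the cited studies.

```python
# Illustrative sketch only (not from the cited studies): computing a simple
# Intentional Binding (IB) index from interval-estimation data.
# Assumption: each trial records the condition ("free" or "coerced"),
# the true action-outcome delay in ms, and the participant's estimate in ms.

from statistics import mean

trials = [
    # condition, actual delay (ms), estimated delay (ms) -- hypothetical data
    ("free",    500, 380),
    ("free",    700, 545),
    ("coerced", 500, 470),
    ("coerced", 700, 690),
]

def mean_estimation_error(data, condition):
    """Mean (estimated - actual) interval for one condition.
    More negative values mean stronger temporal compression,
    i.e. a stronger implicit sense of agency."""
    errors = [est - actual for cond, actual, est in data if cond == condition]
    return mean(errors)

free_error = mean_estimation_error(trials, "free")
coerced_error = mean_estimation_error(trials, "coerced")

# A positive difference indicates more binding (shorter perceived intervals)
# when acting freely than when following orders, i.e. reduced implicit SoA
# under coercion, in line with the pattern reported by Caspar and colleagues.
ib_difference = coerced_error - free_error
print(f"Free: {free_error:.1f} ms, Coerced: {coerced_error:.1f} ms, difference: {ib_difference:.1f} ms")
```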

In summary, when we combine the three aforementioned reasons, a preliminary but compelling argument emerges. Humans display a clear tendency toward anthropomorphism, even exhibiting a willingness to hold robots responsible. Considering this, the SA part of the hypothesis, treating the artificial system as a subordinate, seems feasible. Moreover, commanders are held equally or even more responsible than human subordinates. Additionally, the fact that people engaged in morally challenging tasks are not negatively affected by interactions with autonomous systems suggests that a commander might be a suitable candidate for being held morally responsible.

3 Examining concerns in practical application and hierarchical dynamics

However, relying solely on the results of Sect. 2 does not provide a robust argument for extending responsibility. This is true not only from a logical standpoint; it is further underscored by recent empirical research on the influence of hierarchical relationships on the attribution of moral responsibility. In this section, we argue that such an extension might not be prudent due to the risk of a decrease in the SoA of the commander.

Research conducted on traditional [CH ∧ SH] relationships indicates that in hierarchical settings the SoA of both the commander and the subordinate decreases, so that both feel less responsible [48, 65]. In the 2018 study, Caspar and colleagues investigated, in a commander-subordinate relationship, whether SoA and responsibility pass from the person who receives orders to the person who gives them. In the experiment, volunteers took turns playing the roles of ‘commander’, ‘agent’ or ‘victim’ in a task where the commander instructed the agent to deliver painful shocks to the ‘victim’. They measured the implicit sense of agency but also asked explicit questions about responsibility. The results showed that SoA decreased when agents (i.e. subordinates) received orders, compared to when they freely chose which action to execute, and that SoA decreased in commanders when they ordered agents to administer the shock on their behalf, compared to when they acted on their own. In other words, the results suggest that coercive situations potentially undermine the sense of responsibility of agents and commanders, as both experience less agency over their actions and their consequences than when acting on their own.

In the 2021 study, the methodology was roughly the same. In addition, neuroimaging techniques (functional magnetic resonance imaging, fMRI, and electroencephalography, EEG) were employed to investigate the neuronal activity involved in SoA and in empathy for pain in hierarchical situations. One interesting finding was that a difference in brain activation could be observed between participants who could freely decide which orders to give to another agent (“free commanders”) and participants who were free to decide whether to execute the orders (“free agents”). Brain activity in the relevant areas linked with empathy and emotional social perception was higher among the “free agents” than among the “free commanders”, suggesting that actually performing the action is important for social cognition and that, despite commanders having the same decisional power, being further removed from the outcome of that action does have an influence ([48], p. 14). This is consistent with the comparison of activation between subordinates and order-giving commanders, where the subordinate agents showed higher brain activation in empathy-related areas than the commanders. This again suggests that acting has a greater influence on empathic response than having decisional power ([48], p. 26).

When we consider the prospect of resolving responsibility gaps through the mechanism of command responsibility and combine it with the preliminary conclusions from Sect. 2, specifically that it is plausible to treat the amoral agent as a (non-human) subordinate, we might find ourselves in a situation where there is also a decrease in the SoA of the commander, similar to the traditional [CH ∧ SH] situations. This echoes the findings of Caspar et al., indicating that the true responsibility gap emerges not solely from an inability to blame the machine or the human, but is rooted in hierarchical relationships. This holds true not only in human-machine situations but also in human-human situations.

In this regard, it is helpful to return to the research conducted by Kneer and Christen [38], who specifically investigated hierarchical situations involving autonomous weapons (amoral subordinates). One noteworthy finding was that a commander dispatching a robot pilot was consistently deemed significantly less responsible for the harm than a human pilot in traditional situations. While the commander in these scenarios was generally considered equally or more responsible than in situations with a human pilot, the level of responsibility attributed to the commander was still lower than that attributed to the human pilot. This indicates that, in the eyes of the person judging the situation, not all responsibility is fully transferred from the robot pilot to the commander. These findings are consistent with the research of Bigman et al. [66] and Shank, DeSanti, et al. [67], which reveals that blame attributed to AI for moral wrongdoing is less than that attributed to humans, and that humans monitoring AI are faulted less than those working in teams composed solely of humans.

These findings go to the core of the problem, providing empirical evidence for what philosophers have theorized. In human delegation, an authoritative agent (commander) delegates tasks or competences to a subordinate along with a portion of responsibility, but retains a variable portion of responsibility. When bad consequences arise from the actions of subordinates, not only are the subordinates held responsible, but a share of the responsibility shifts to the superior if certain conditions are met. Applying this framework to situations where commanders delegate tasks or competences to amoral subordinates, such as autonomous machines [CH ∧ SA], seems analogous, and there is no apparent reason why part of the responsibility cannot be shifted to the superiors in cases of negative consequences caused by these amoral subordinates. However, a crucial distinction emerges: in human-to-human delegation, it remains possible to hold the subordinate responsible for the underlying actions leading to adverse outcomes. Conversely, in delegations to amoral agents, it becomes (normatively) impossible to assign responsibility to the subordinates for their actions. When we transfer responsibility for the actions of amoral subordinates to commanders, the distribution of this responsibility becomes unclear. Questions arise regarding the nature of the commander’s responsibility and what responsibilities, if any, remain with the subordinates after the transfer. A complete transfer of responsibility from machine to user appears unfeasible due to a distinction between two types of responsibility: (1) moral outcome responsibility, which pertains to the responsibility for the actual consequences or outcomes of the actions taken by autonomous machines, and (2) moral responsibility for use or deployment, which relates to the user’s responsibility for utilizing a machine with the potential to cause unfortunate results. This distinction highlights the challenge of ‘residual responsibility’: the portion of (1) moral outcome responsibility that has not been fully transferred from the subordinates to the commander. To address the responsibility gap as the pessimists (Footnote 7) conceive it, a complete transfer of (1) would be required. However, it seems that commanders are responsible for (2). Consequently, a comprehensive solution to the responsibility gap still appears elusive, and the gap, as pessimists envision it, seems to be inherently unsolvable because the ‘remaining’ responsibility can never be fully transferred. Nevertheless, it is worth mentioning in this context the work of Frank Hindriks and Herman Veluwenkamp, who argue that this kind of reasoning rests on a mistaken assumption, namely that the amount of blame that is appropriate in the nearby possible world is an appropriate reference point for the actual world, and that there is no good reason to think this assumption is correct. They therefore argue that this ‘deficit conception’ of responsibility gaps is problematic and that there are no responsibility gaps ([71], pp. 5–6).

Kneer and Christen [38] did not distinguish between the two types of responsibility in their experiments. However, they also ended up with a ‘remaining’ share of responsibility that was not transferred to the commanders. If all responsibility had been fully transferred from the (amoral) subordinate to the commander, their results would have shown an equal amount of responsibility allocated to the commander dispatching a robot pilot as to the human pilot (subordinate). In summary, this not only highlights that, empirically speaking, the true responsibility gap seems to exist in every hierarchical relationship, whether human-human or human-machine, but also suggests that human-machine situations leave us with an untransferred share of responsibility.

4 Navigating the interplay between ethics and descriptive realities

Not all normative solutions need to be grounded in purely descriptive facts. In fact, this is rarely the case, and deriving a moral ‘ought’ from an ‘is’ is a fallacy (Footnote 8). The questions we address in moral philosophy in relation to responsibility differ substantially from those investigated in psychology. While moral philosophy attempts to answer ‘who (if anyone) can rightly be held responsible for harm?’, psychology explores actual human tendencies in assigning responsibility in such contexts. This exploration includes two distinct sub-questions, namely (1) ‘what are people’s retributivist moral dispositions?’ (outsider view) and (2) ‘how responsible do people who are being held responsible actually feel?’ (insider view). Both implicit measurement methods, such as the IB, which allow the SoA to be measured quantitatively, and explicit methods, such as asking questions or self-reports, can be used to answer these questions. However, this does not mean that both methods are equally suited to answering both questions. For example, both methods are used to measure SoA, but the first question, finding out what the retributivist moral dispositions of outsiders are, seems answerable only by obtaining a direct report on how they attribute responsibility. Preliminary outcomes from psychological research relevant to command responsibility suggest that (1) people might not tend to (fully) blame commanders for bad outcomes caused by autonomous machines, and (2) commanders themselves feel less responsible for tasks carried out by subordinates. However, these descriptive facts are not reasons to believe that commanders cannot be held responsible or that people should not hold them responsible.

Therefore, the risk described in Sect. 3 need not in principle be a problem for using the solution of command responsibility for negative outcomes caused by AI subordinates. The reason is that the responsibility gap that would arise in such a scenario is not inherently more problematic than in traditional configurations [CH ∧ SH], where we generally attribute responsibility to commanders despite their apparent lack of felt responsibility. Especially in military contexts, it seems plausible that commanders do not feel responsible for adverse outcomes caused by human subordinates, yet we do hold them responsible. This is done to ensure sufficient supervision by the commander, and this rationale also holds for the AI subordinate. One example worth mentioning in this regard is the Yamashita case. General Tomoyuki Yamashita was ultimately held responsible for numerous war crimes relating to the Manila massacre and many atrocities in the Philippines against unarmed civilians and prisoners of war between 9 October 1944 and 2 September 1945, and was sentenced to death, despite the controversy surrounding the case (Footnote 9). It was argued that he was responsible for his subordinates and had failed in his duty as commander to control the operations and the members of his command, thereby permitting them to commit brutal atrocities.

The case was controversial because no evidence was presented to show that Yamashita had ordered the violence or had known about the acts, and the standard of liability set out by the military commission amounted to an objective form of liability, pursuant to which a commander could be held criminally responsible for crimes committed by his troops despite having no control over the criminal acts of his subordinates and regardless of any awareness or knowledge on his part that such crimes had been committed [21]. Yamashita maintained until the very end that, because of a disruption in communication and command structure, there was no way for him to control all the actions of all his subordinates, and he denied having had knowledge of the crimes they committed. He claimed that he would have punished them harshly had he known. In general, when we make moral and legal decisions, we do so based on normative reasoning that involves requirements. In the case of command responsibility, these include a relationship between commander and subordinates, the commander’s control over the actions of the subordinates, and the knowledge the commander had, or should have had, that a certain outcome was going to occur. As such, the case was controversial not because Yamashita himself did not feel responsible, or because it was shown that outsiders had no tendency to blame him, but because the requirements we generally use for attributing blame were seriously contested in this case.

In conclusion, many ethical solutions may not align with descriptive facts, and not all normative solutions need to be grounded in descriptive facts. It is incorrect to absolve people of responsibility solely because they do not feel responsible, just as it is incorrect to argue that people should be held responsible simply because they feel responsible. So where does this leave us in the context of applying the doctrine of command responsibility to LAWS? Based on the empirical research, the theoretical solution of command responsibility becomes more plausible. However, empirically speaking, there is a serious risk of encountering impediments to assigning responsibility in hierarchies involving AI systems, similar to those in traditional hierarchical situations. It would be misguided to assert normatively that commanders are responsible purely on the basis of descriptive factors such as people (not) holding commanders responsible or commanders (not) holding themselves responsible.

4.1 Recognizing the vital role of empirical evidence in ethical frameworks

Nevertheless, caution is warranted: a degree of reluctance to decouple ethical solutions from empirical psychological studies is advisable because of the unwanted risks that may arise when ethical solutions deviate significantly from descriptive empirical facts. It is desirable that the gap between the two not be too broad, as thoughtful research benefits from some level of empirical support. Furthermore, a significant gap should at least prompt ethicists to try to understand why it exists. In this regard, three arguments can be raised for why the two should not diverge significantly. We discuss each in turn.

First, for risk mitigation, descriptive psychological facts can highlight potential unintended consequences of ethical solutions. Some alignment between ethical solutions and descriptive psychological facts helps ensure that ethical solutions are designed to minimize negative psychological impacts and behavioral side effects. An example, in light of our discussion on using the doctrine of command responsibility to address the responsibility gap in the context of non-human agents, is the risk that superiors may (un)intentionally try to evade responsibility more readily when delegating tasks to machines than to humans. This possibility introduces the concern of a false diversion of moral responsibility [73]. By categorizing artificial agents as subordinates under the doctrine of command responsibility, there may be an unintended incentive for strategic moral scapegoating. This phenomenon involves successfully shifting punishment to the artificial agent, providing superiors with a means to evade punishment more effectively when tasks are delegated to machines rather than humans. Such a scenario raises ethical concerns, as it could lead to an overuse of machines or human-machine teaming in certain situations. This overreliance, driven by a potential escape route for superiors from responsibility, may compromise decision-making processes and ethical considerations when using autonomous systems. In essence, while the theoretical underpinnings of applying command responsibility to non-human agents may appear plausible, the potential for unintended consequences and strategic moral scapegoating, as demonstrated by empirical research, necessitates a careful and thorough evaluation of its practical implications. Balancing the need for accountability against the risk of misuse and overreliance on machines is crucial in determining the feasibility and desirability of extending the doctrine in the context of artificial intelligence and autonomous systems.

Second, we can point to practical concerns. The convergence of ethical solutions with descriptive psychological facts and empirical studies strengthens the foundation of ethical frameworks, ensuring that the proposed solutions are feasible and making them more applicable, acceptable, and effective in real-world contexts. This also benefits informed policymaking: decision-makers, whether organizations or governments, benefit from ethical solutions that are aligned with empirical studies, as this helps them make informed decisions based on a realistic understanding of human behavior. Moreover, to increase public trust, ethical solutions that align with psychological facts avoid idealization and acknowledge the complexities and nuances of human behavior, steering clear of overly optimistic or pessimistic assumptions that may not reflect reality. It is also important to note that by including empirical evidence, ethical solutions acknowledge the diversity of human experiences and behaviors, promoting an ethical framework that is relevant to and considerate of different human realities. If we argue for a normative theory, such as applying the command responsibility framework and blaming human commanders (in other words, imposing moral responsibility on people), we had better make sure that it works and that we model and cultivate intuitions along these lines.

Third, there is a more theoretical consideration for why the gap should not be too broad. Ethics does not fall from the sky; it is developed by humans to regulate relationships. Our ethical theories aim to balance what people think, their intuitions, and empirical findings on the one hand with the normative case on the other. The key is to strike this balance well: neither relying solely on practical feasibility, nor falling into the trap of assuming that empirical research has the last word in everything, nor ignoring empirical research and the realities of many people. Our normative theories do not come out of thin air, and command responsibility is already influenced by our intuitions. The concept of moral responsibility appears to be intimately linked to, and shaped by, the fact that we are social human beings existing within a specific moral community. This can be illustrated in the case of command responsibility. One of the reasons for the development of the doctrine of command responsibility after the Second World War, when thousands of individuals were held accountable for their actions, was the realization that international criminal law was underdeveloped and that a deep chasm existed between what was generally regarded as morally repugnant and the range of conduct that qualified as criminal offenses under international law [21]. There was, therefore, a general awareness of the necessity for international law to catch up with basic moral standards. Furthermore, from a theoretical standpoint, attending to empirical research is important to ensure the flexibility and adaptability of ethical solutions. As our understanding of human behavior evolves, ethical frameworks that are informed by empirical studies can evolve based on new insights. Ethical solutions should be adaptive and open to incorporating new knowledge. For example, if empirical studies reveal non-intuitive aspects of human behavior, ethical frameworks may need to evolve to accommodate these insights. Such non-intuitive empirical results may prompt deeper ethical reflection and can stimulate ethical discourse, as ethicists may need to critically examine the reasons behind certain intuitions and consider whether they align with broader ethical principles or reflect biases (Footnote 10). Ethical frameworks are not static: if empirical studies challenge intuitions, ethical frameworks should undergo an iterative process, continually refining and adapting to emerging knowledge.

5 Conclusion

In this paper, we have closely examined the concept of command responsibility, which is often proposed as a solution to responsibility gaps emerging in systems with increasing autonomous capabilities, particularly within the context of LAWS. Drawing on recent psychological research, we have argued that the solution initially appears theoretically plausible and is supported by preliminary empirical evidence. However, upon deeper exploration of existing empirical studies on hierarchical human relationships, we identified significant obstacles to assigning responsibility in hierarchies involving AI systems, paralleling the challenges in traditional hierarchical settings. Nevertheless, we argue that these empirical concerns alone should not rule out the application of command responsibility to non-human entities. Instead, we emphasize the importance of integrating empirical realities into normative discussions and solutions. This not only serves practical purposes but also enhances the ethical discourse itself. By acknowledging and incorporating empirical evidence, we can refine our ethical frameworks and develop more robust solutions to address emerging challenges in AI governance within military contexts.