1 Introduction

Bridging the gap between the theory of AI ethics and the practical design of machine learning (ML) systems poses a considerable challenge [26]. How do we get from the ‘what’ of AI ethics to the ‘how’? One prominent approach involves translating ethical principles (e.g., fairness and transparency) into practical frameworks, resources, and tools that responsible agents (e.g., ML developers, decision-makers, and other stakeholders) can use to ensure that ML systems act according to the relevant principles (“value alignment”) [5, 26]. This approach is common across the subfields of AI ethics that center on defining and implementing principles such as fairness, transparency, and accountability. So-called fair ML research is a prominent example, proposing numerous practical frameworks and technical methods for aligning ML systems with principles such as non-discrimination, equality, and justice [2]. To support value alignment, the field has produced numerous evaluation metrics [22, 24], computational techniques [22], and software toolkits [3, 11, 20, 23] that responsible agents can use to detect and mitigate unacceptable bias in ML applications in real-life settings.

However, the practical application of AI ethics frameworks for purposes of value alignment has been hindered by what is here called the action-guidance gap. This gap refers to cases where a responsible agent is unable to use the preferred framework as a “guide” for aligning the ML system with the framework’s principle(s). This gap is evident in fair ML practice, as numerous studies have identified challenges pertaining to the real-life application of fair ML frameworks and software toolkits. On the one hand, the content of existing fair ML frameworks is often highly abstract, leaving responsible agents without clear and specific prescriptions about how to evaluate and ‘implement’ fairness in ML systems [8, 16, 20]. On the other hand, various practical obstacles, such as restricted access to required evaluation data, information gaps, and skill deficits in ML development teams, also prevent agents from applying fair ML frameworks and related tools correctly [16, 20, 28, 35]. These and other problems have led some to argue that available fair ML frameworks “do not offer sufficient practical guidance” for responsible agents to address unfair or inequitable biases in ML systems [8, p. 60]. This sentiment notably aligns with broader concerns that principle-based frameworks for AI ethics simply “are not specific enough to be action-guiding” [25, p. 503; see also 37].

The concept of action-guidance has thus far received insufficient attention in AI ethics research (see, however, [29]). This article seeks to remedy this issue, drawing on philosophical debates in moral and political philosophy that revolve around the question: when are moral principles action-guiding to agents? [1, 6, 10, 27, 32, 33, 34]. The article provides a twofold contribution: first, it outlines a conceptual account of action-guidance. This is done in Sect. 2, which discusses the concept of action-guidance, introduces key conceptual distinctions, and outlines necessary conditions for action-guidance. The proposed account clarifies the nature of action-guidance and also provides a conceptual basis for analyzing and addressing problems with action-guidance in the domain of AI ethics. Second, the article demonstrates the applicability of the proposed account of action-guidance, centering on the practical application of fair ML frameworks qua strategies for ensuring ML models’ satisfaction of formal criteria for model fairness as a general case example. Section 3 provides an overview of standard fair ML frameworks and related tools for evaluating, improving, and selecting ML models, and briefly discusses some problems that have been identified in regard to their real-life application. Drawing on the necessary conditions of action-guidance outlined in Sect. 2, Sect. 4 specifies so-called construct requirements and agent requirements for action-guidance in fair ML practice. The section proceeds to analyze previously identified problems with action-guidance in this context, construing them as violations of the specified requirements. Broadly speaking, the analysis suggests that failures of action-guidance occur when fair ML frameworks contain ambiguous constructs (e.g., underspecified criteria for model fairness), when the framework is not applied correctly by users (e.g., the fairness criteria are applied incorrectly), or when applying or following the framework correctly is strictly not possible for the user (e.g., satisfying the frameworks’ criteria for model fairness is impossible).

The article concludes with a brief discussion of implications for fair ML and AI ethics more broadly. In line with findings from previous philosophical and empirical works (see [8, 16, 20, 28]), the article recommends improving and further specifying the content of available fair ML frameworks (e.g., by defining their intended scope and including instructions on their proper application) and promoting users’ capacity to implement available frameworks in real-life settings (e.g., by ensuring users’ access to information, skills, and practical resources required to evaluate and improve model fairness). The final section also presents three ways in which the proposed account of action-guidance could be used in other (sub)domains of AI ethics.

2 Action-guidance in moral and political theory

It is a common contention that moral theory should not only explain what is right or good but also be usable by agents for purposes of acting morally [27, 32, 33]. This contention has spurred debate on what makes moral theories—or, more specifically, the principles or codes of ethics that they propose—action-guiding to agents [1, 6, 10, 27, 32, 33, 34]. The debate remains lively, and there is no strict consensus regarding the meaning of ‘action-guidance’ or its constitutive conditions. Nevertheless, prevailing perspectives converge on some key notions, providing a basis for the present examination of action-guidance in fair ML practice and AI ethics more generally. This section discusses some general points of agreement regarding action-guidance (Sect. 2.1) and outlines a set of necessary conditions for action-guidance (Sect. 2.2).

2.1 What is action-guidance?

Most theorists agree that action-guidance consists in the “successful offering of guidance for […] action, not the actual guiding of action” [34, p. 531]. This excludes the idea that a principle (or a set of principles) guides an agent’s action if and when it causes the agent to act morally or when it prescribes an action with which the agent de facto complies. Instead, the common position is that action-guidance obtains when the “agent does what a normative theory tells her to do because she correctly recognizes that this is what the theory tells her to do” [10, p. 227]. This is often taken to mean that a normative theory’s principles should be usable to agents as decision-making procedures or strategies for compliance [27, 32, 33].

Moreover, it is evident that principles can guide action without providing a correct account of goodness or rightness, and vice versa. An agent can be guided to ϕ by a principle that prescribes ϕ-ing, but it does not follow that ϕ-ing is actually the morally right action. Conversely, the moral rightness of the agent’s ϕ-ing does not imply guidance by a given principle. The applied principle’s consistency with genuine moral demands is distinguished from its capacity to guide action. This implies that “[p]rinciples that offer equal guidance can still be better and worse than each other” in terms of capturing what morality requires and that “principles that don’t offer guidance can be better and worse than each other” in the same respect [29, p. 387].

A further point of consensus is that failures of action-guidance can stem from deficiencies in the applied principle or the agent’s inability to apply it correctly (or both). On the one hand, “the principle itself may suffer from defects that prevent its practical use, or several principles that together comprise a moral code may jointly suffer such defects” [33, pp. 370–371]. On the other hand, the agent may lack “the beliefs and abilities needed to derive a prescription […] and act in conformity with that prescription” [27, p. 81]. This suggests what I will call the Dual Requirement Thesis: action-guidance obtains between the principle(s) and the agent when certain requirements are met both by the principle(s) and the applying agent. The Dual Requirement Thesis is exemplified, for instance, by Richard North’s statement that principles of justice must be “capable of delivering a determinate set of coherent and consistent verdicts about the justness or unjustness of actions across a range of cases” and the agent must have “the ability to identify the verdicts that a principle delivers on the justness or unjustness of each of the actions that are available to her” [27, p. 79].

Philosophers also tend to distinguish between two modes of action-guidance: (i) direct or prescriptive action-guidance and (ii) indirect or orienting action-guidance (see [6, 27, 34]). Suppose a doctor ponders whether it is permissible to perform a life-saving surgery on a dying patient who does not consent to the surgery. The doctor consults a set of principles T1, which states that the duty to respect a patient’s autonomy overrides the duty of beneficence, and so derives the prescription that performing the surgery is impermissible. Here, T1 is action-guiding to the doctor in the direct sense: it specifies the deontic status of the actions available to the agent by stating whether those actions are permissible, impermissible, or obligatory. Now, suppose the doctor instead consults another set of principles T2, which states that, in ideal circumstances, the doctor ought to fulfill both duties (respect the patient’s autonomy and perform the life-saving surgery). T2 offers the doctor indirect guidance: it guides them in the general direction of an ideal outcome or target by depicting the best possible state-of-affairs.

2.2 Necessary conditions for action-guidance

Philosophers have sought to explicate necessary conditions of action-guidance (e.g., [27, 33]). The Dual Requirement Thesis motivates a distinction between two kinds of conditions proposed in existing work—namely, what I will call (1) Construct Conditions that ensure the action-guiding capacity of the principle(s) and (2) Agent Conditions that specify what kinds of information, abilities, and resources agents must (be able to) possess to ensure correct application. The conditions are summarized in Table 1 and detailed below.

Table 1 An account of the necessary conditions of action-guidance (see [6, 27, 32, 33]), as distinguished into Construct Conditions (CC) and Agent Conditions (AC)

2.2.1 Explanation

Normative principles explain what makes actions or outcomes good or right (or bad or wrong). The extent to which principles execute their explanatory function has implications for action-guidance. For example, North argues that “unless a principle [of justice] correctly specifies what makes political institutions just or unjust, it will fail to guide an agent’s reasoning in a way that is consistent with what justice requires” [27, p. 78]. This means that the principle(s) must track features of acts or other objects of moral appraisal that bear on their deontic status, and so explicate grounds based on which agents should evaluate those acts or objects’ (im)permissibility or obligatoriness.

2.2.2 Consistency

Upon correct application, the principle(s) should deliver identical verdicts for identical cases (“treat like cases alike”) [27, p. 78]. If defective, the principle(s) will map acts or objects that are identical in their good- or right-making properties into distinct deontic statuses.

2.2.3 Determinacy

The principle(s) should unambiguously map each act or object onto a single deontic status. For example, when principles of justice are applied to evaluate an available allocation of social goods, the resulting evaluation should indicate whether that allocation is (im)permissible or obligatory (see [27]). Otherwise, even the correct application of the principle(s) can yield indeterminate evaluations; the principle(s) then lack what Naima Chahboun calls empirical determinacy [6, p. 553]. Recalling the two modes of action-guidance, note that whether principles are action-guiding in the direct or the indirect sense in a given case depends on the set of actions that are actually feasible for the agent.

Note also that moral theories (or ethics frameworks) can include multiple principles which may trade off upon implementation [27, p. 78]. To provide determinate evaluations of deontic status, therefore, the broader theory must specify how its principles relate to each other, for instance, by providing a priority ordering that determines how tradeoffs should be resolved. This is what Chahboun calls normative determinacy: an action-guiding theory is explicit about whether and when some principles override or ‘trump’ others [6, p. 553]. Furthermore, some theories allow for incommensurability between distinct principles. In such cases, it becomes necessary to explicate whether and under what conditions the operative principles are incommensurate.

2.2.4 Scope

Scope refers to the domain or range of cases to which a set of principles is intended to apply. From the perspective of action-guidance, the intended scope should be clearly stated and neither too narrow nor too broad. On the one hand, “narrowly drawn [principles] will avoid the problem of indeterminacy but only at the cost of delivering verdicts on a very small number of actions” and are thereby “liable to have nothing to say about most if not all of the actions available to an agent” [27, p. 78]. On the other hand, they ought not be expected to provide verdicts on cases outside their scope. For instance, principles of medical ethics are not directly applicable to romantic relationships, nor should we expect them to be.

2.2.5 Coherence

North notes that “to function as a decision-making procedure a principle of justice needs to deliver verdicts that recommend a coherent course of action” [27, p. 78]. This notion is captured by the Coherence Condition which states that principles should prescribe actions that cohere with empirical facts about the world. A principle that violates this condition recommends actions that violate laws of logic or physics, for instance, and hence requires agents to perform actions that are inconceivable or considerably overdemanding.

2.2.6 Constituency

Moving on to the Agent Conditions, an agent must know (or be reasonably expected to find out) whether and when they should comply with the principle(s) [27]. Though the principle(s) apply (or not) regardless of the agent’s knowledge, action-guidance requires that the agent be informed in this regard. This condition is closely tied to the Scope Condition: to apply the right principle(s) in the right cases, the agent must be aware of the principles’ applicatory scope.

2.2.7 Comprehension

The agent must possess (or be reasonably expected to acquire) the beliefs required for deriving a prescription that is consistent with the applied principle(s) [27]. This includes beliefs about the right-making features determinative of the deontic status of actions or objects as well as their specific descriptive properties (see Explanation above). Recall the doctor who should respect the patient’s autonomy by obtaining their informed consent to surgery. To fulfill their duty to respect patient autonomy, the doctor should comprehend what constitutes ‘informed consent’ as per the operative principle, but also whether the patient’s actions (e.g., uttering “Please, doctor, save me!”) meet the constitutive criteria.

2.2.8 Correctness

Suppose the doctor comprehends the principle’s requirements but arrives at a false belief about whether the patient has provided informed consent and for this reason fails to fulfill their duty toward the patient. The doctor’s mistake constitutes a violation of Correctness, the condition which states that the beliefs required for the proper application of the principle should also be correct [27].

2.2.9 Ability

The agent should possess (or be reasonably expected to acquire) the abilities required to apply and comply with the principle(s) [27]. To determine whether the patient can provide informed consent, for instance, the doctor in our example must possess certain perceptual, cognitive, and deliberative capacities that are required to assess whether the patient is of sound mind, aware of the risks of the surgery, and so on. In some cases, complying with a principle may also require certain physical capacities or practical resources. For example, the doctor might require diagnostic tools and time to properly assess the patient.

Note that Coherence, Comprehension, and Ability relate closely to the maxim ‘ought implies can’: agents should be obligated to perform only actions that they can perform. The maxim highlights that formulating usable principles often requires striking an appropriate balance between moral desirability and practical feasibility. If using a principle necessitates beliefs, abilities, or practical resources which an agent cannot be reasonably expected to possess (or even acquire), the principle is arguably not usable to the agent, at least in the direct sense of action-guidance specified above, because it is overdemanding. In contrast, a principle might be directly action-guiding but under-demanding in a moral sense; its application might necessitate only beliefs and abilities the agent has and not the beliefs and abilities “we should expect her to have” [27, p. 80; italics added]. This has been discussed by political theorists who note that actionable principles of justice might prove “concessive” if they retrofit the demands of justice to the expected behavior of noncompliant citizens and institutions in an unjust society [27].

3 Fairness in machine learning: operationalizing and implementing ‘fairness’

The notion of action-guidance is highly pertinent to ongoing critical discussions of principle-based frameworks for AI ethics and their usefulness [25, 37]. Frameworks for fairness in ML and the problems that have been identified with respect to their practical implementation provide an illustrative case example for exploring questions related to action-guidance in AI ethics contexts. To provide some necessary background, this section begins by presenting a brief overview of fair ML research and standard frameworks and tools developed therein. The section then describes the evaluative and practical-normative roles that fair ML frameworks play in practical contexts of “fairness-sensitive ML design” and mentions some prominent problems pertaining to their real-life application.

3.1 Fair machine learning frameworks

Broadly speaking, fair ML comprises a field of research that studies bias and (un)fairness in ML applications and develops methods for detecting and correcting problematic biases, which can lead to unfair discrimination or inequitable or unjust outcomes [2, 22, 24]. The field has produced a range of practical frameworks, measures, and tools for aligning ML models with broadly egalitarian principles such as non-discrimination, equality, and justice. Frameworks and other practical resources are also increasingly packaged into software toolkits that responsible agents (e.g., ML model developers and decision-makers) can use to evaluate and improve ML models [20]. Prominent examples of such toolkits include IBM’s AI Fairness 360 [3], UChicago’s Aequitas [30], Google’s What-If tool [11], and Microsoft’s Fairlearn [23].

Standard frameworks for fairness in ML commonly consist of (i) a set of fairness criteria and a corresponding set of fairness metrics and (ii) a set of value implementation methods, such as debiasing techniques, that are used to align the ML model with the chosen criteria. Different frameworks can be characterized by variations in these components. A fairness criterion is a computationally tractable formal criterion for model fairness which operationalizes or expresses some more general notion of fairness, equality, or justice. There are dozens of fairness criteria available [24], but I will mention only a few notable examples. For simplicity, the examples are defined in the context of a binary classification task where D ∈ {0, 1} represents the computed classification, Y ∈ {0, 1} represents the actual class (or “ground truth”), and G ∈ {a, b} represents a protected attribute with two values (e.g., male, female).

  • Predictive parity [7, 18]: The compared groups G experience equal rates of correct positive classifications (equal positive predictive values). Formally: P (Y = 1 | D = 1, G = a) = P (Y = 1 | D = 1, G = b).

  • Equalized odds [13]: The compared groups G experience equal error rates. Formally: P (D = 1 | Y = 0, G = a) = P (D = 1 | Y = 0, G = b) and P (D = 0 | Y = 1, G = a) = P (D = 0 | Y = 1, G = b).

  • Fairness through awareness [7]: Individuals who are similar in terms of some pre-defined set of decision-relevant model features X should be classified similarly. The exact formalization of this criterion depends on the applied measure of similarity.

The agent applying the framework (henceforth “the user”) must usually specify or tailor the chosen set of criteria into fairness metrics that can be applied to evaluate the ML model. For instance, when applying Fairness Through Awareness, the user begins by selecting the data categories or model features (X) that are relevant to making decisions about individuals in a given decision-making context and then constructs a similarity metric that includes those categories. In a medical diagnostic context, the categories might include, for instance, patients’ symptoms and medical history. The user then checks whether the model produces similar predictions for pairs of individuals who are relevantly similar according to that metric. A medical diagnostic model should, according to this metric, produce similar diagnoses for patients with similar symptoms, regardless of their other sensitive characteristics (e.g., ‘race’ or gender).
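To make this evaluative step concrete, the following is a minimal Python sketch of how the criteria above might be turned into computable group metrics. The function names, the toy arrays, and the choice to report absolute gaps between the two groups are illustrative assumptions for this example, not the conventions of any particular fairness toolkit.

```python
import numpy as np

def true_positive_rate(d, y, g, group):
    """P(D=1 | Y=1, G=group): share of actual positives classified as positive."""
    mask = (y == 1) & (g == group)
    return d[mask].mean()

def false_positive_rate(d, y, g, group):
    """P(D=1 | Y=0, G=group): share of actual negatives classified as positive."""
    mask = (y == 0) & (g == group)
    return d[mask].mean()

def positive_predictive_value(d, y, g, group):
    """P(Y=1 | D=1, G=group): share of positive classifications that are correct."""
    mask = (d == 1) & (g == group)
    return y[mask].mean()

def predictive_parity_gap(d, y, g):
    """Absolute difference in positive predictive values between the two groups."""
    return abs(positive_predictive_value(d, y, g, "a")
               - positive_predictive_value(d, y, g, "b"))

def equalized_odds_gap(d, y, g):
    """Largest absolute difference in false positive and false negative rates."""
    fpr_gap = abs(false_positive_rate(d, y, g, "a") - false_positive_rate(d, y, g, "b"))
    fnr_gap = abs((1 - true_positive_rate(d, y, g, "a"))
                  - (1 - true_positive_rate(d, y, g, "b")))
    return max(fpr_gap, fnr_gap)

# Toy classifications D, ground truth Y, and protected attribute G (illustrative only).
d = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y = np.array([1, 0, 1, 0, 0, 1, 1, 0])
g = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

print("Predictive parity gap:", predictive_parity_gap(d, y, g))
print("Equalized odds gap:", equalized_odds_gap(d, y, g))
```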

If bias is identified according to the operative fairness metric(s), a debiasing technique is used to enhance the (unconstrained) model in terms of the metric(s). Debiasing mitigates or eliminates bias according to the metric(s) and produces a new (constrained) model which maximizes overall model performance subject to the satisfaction of the applied metric(s) [8, p. 48]. Existing debiasing techniques are commonly distinguished into (i) pre-processing techniques which intervene on the model’s training data, (ii) in-processing techniques which target the learning algorithm that is used to train the model, and (iii) post-processing techniques which are applied to adjust the outputs generated by the model (see [22]).
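As a hedged illustration of the post-processing family only, the sketch below adjusts one group’s decision threshold on a scored model so that the two groups’ true positive rates roughly coincide. The synthetic scores, the grid of candidate thresholds, and the decision to hold group a’s threshold fixed are assumptions made for this example; real debiasing techniques are considerably more sophisticated.

```python
import numpy as np

def group_tpr(scores, y, g, group, threshold):
    """True positive rate, P(D=1 | Y=1, G=group), at a given decision threshold."""
    mask = (y == 1) & (g == group)
    return (scores[mask] >= threshold).mean()

def postprocess_threshold(scores, y, g, base=0.5):
    """Toy post-processing step: keep group a's threshold at `base` and choose
    group b's threshold so that the gap in true positive rates is minimized."""
    target = group_tpr(scores, y, g, "a", base)
    candidates = np.linspace(0.0, 1.0, 101)
    gaps = [abs(group_tpr(scores, y, g, "b", t) - target) for t in candidates]
    return base, float(candidates[int(np.argmin(gaps))])

# Synthetic scores, outcomes, and group labels, used for illustration only.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.uniform(0.3, 1.0, 50), rng.uniform(0.0, 0.8, 50)])
y = rng.binomial(1, scores)
g = np.array(["a"] * 50 + ["b"] * 50)

t_a, t_b = postprocess_threshold(scores, y, g)
print("Threshold for group a:", t_a, "| threshold for group b:", t_b)
```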

3.2 Fair machine learning in practice

Fair ML frameworks (and their components) play specific evaluative and practical-normative roles in the practice of ML model evaluation, improvement, and selection. First, fairness criteria specify so-called (i) ‘ought-to-be’ norms which define what ML models ought to be like in terms of their properties to count as ‘fair’ in some respect or overall and (ii) ‘ought-to-do’ norms which specify corresponding obligations for the user (or other responsible agents). For example, the fairness criterion Equalized Odds expresses that an ML model should exhibit equal error rates between relevant groups (e.g., protected groups) and therefore establishes that the user(s) should ensure that their ML model satisfies this criterion. Second, fairness metrics comprise instruments for quantifying models’ alignment in terms of ‘fairness’, indicating their divergence from the operative fairness criteria [8, p. 59]. Lastly, debiasing techniques constitute the primary means of value implementation, providing a way to ensure models’ compliance with the ought-to-be norms (i.e., fairness criteria) and the user’s compliance with the ought-to-do norms (i.e., building a fair model).

Numerous studies observe that the (correct) application of fair ML frameworks and fairness-sensitive design tools (e.g., software fairness toolkits) remains challenging in real-life ML development. On the one hand, many studies suggest that fair ML frameworks provide inadequate conceptual lenses for identifying and addressing unacceptable bias. For example, standard frameworks allegedly fail to capture crucial aspects of fairness and distributive justice [8]—including broader sociotechnical notions of fairness [31]—and lack guidance on addressing tradeoffs between different fairness criteria [8]. These problems underscore the fact that translating principles into well-specified, robust value constructs presents a considerable theoretical operationalization challenge with many looming pitfalls [17]. On the other hand, empirical studies also suggest that framework users are often unable to implement or comply with the proposed strategies for detecting and mitigating bias. This is due to diverse practical constraints, including information gaps, insufficient technical and domain-specific skills and resources, and lack of access to the data required for model evaluation and debiasing [16, 20, 28, 35].

These challenges suggest that available frameworks are often abstract, ambiguous, or otherwise defective in terms of their content, and that there are also practical obstacles standing in the way of frameworks’ correct application. This aligns with the Dual Requirement Thesis mentioned above. I maintain that the outlined account of action-guidance can help to clarify and address these challenges. Indeed, the next section draws on the account and derives a set of requirements for action-guidance in the context of fair ML. The requirements are used to analyze the identified problems in more detail, and to demonstrate that the problems can be construed as violations of the requirements for action-guidance. However, two disclaimers are due before I proceed. First, recalling the distinction between theories’ veracity and their capacity to guide action (see Sect. 2.1), I will bracket the question of whether the correct application of a given fair ML framework is necessary and/or sufficient for actually acting consistently with principles of fairness, equality, and justice. This means that I will also bracket sociotechnical considerations of fairness and instead operate with a narrower, technical understanding of fairness as model fairness. The focus is on whether the user(s) can apply a standard fair ML framework correctly (as required by the framework) during ML model evaluation, improvement, and selection. Second, available fair ML frameworks and toolkits differ in many respects, which means that specific questions regarding their capacity to guide users’ action cannot be fully answered without attending to those differences. I wish only to draw some general insights about these matters and will therefore focus on problems that concern most available frameworks.

4 Defective construct or defective application? Examining action-guidance in fair machine learning practice

Suppose an ML model developer called Emily is tasked with building a predictive model for college admissions. The model predicts whether an applicant will succeed if admitted to college based on factors such as the applicant’s previous academic performance. To ensure the model is fair, Emily consults a fair ML framework which proposes a set of fairness criteria and a debiasing technique for improving model fairness. What requirements must be met for Emily to be able to apply the framework as a strategy for aligning the prediction model with the proposed criteria? Here, the Construct and Agent Conditions that were outlined in Sect. 2.2 can be used to specify a set of Construct and Agent Requirements for action-guidance in the context of fairness-sensitive ML design. The proposed requirements are summarized in Table 2, and I will flesh them out more fully throughout this section. The list of requirements may not be exhaustive. However, it aligns with the standard philosophical accounts of action-guidance (Sect. 2.1) and provides a general, revisable framework for identifying and analyzing problems concerning action-guidance in fair ML practice.

Table 2 Construct requirements (CR) and agent requirements (AR) for action-guidance in fair ML practice

The aim of the following subsections is to flesh out the proposed requirements and demonstrate their usefulness as conceptual tools for analyzing previously identified problems regarding the real-life application of fair ML frameworks, guidelines, and tools. The problems are also described in more detail throughout this section. I argue that deficiencies in meeting specific requirements explain when and why action-guidance fails to obtain in the cases that previous research has identified: Failures to satisfy Construct Requirements reveal that there are problems in how fair ML frameworks (or their component parts) have been operationalized, defined, or specified. Underspecification of frameworks’ content induces risks of ambiguous evaluations, misapplication of principles, and other problems. Violations of Agent Requirements are found in cases where the user of the framework (or related tool) lacks the information, abilities, and/or resources required for correct application. Both types of problems may occur simultaneously, resulting in multiple violations. For this reason, I will discuss some Construct Requirements and Agent Requirements together under the same subheading.

4.1 Explanation and consistency

The Explanation Requirement (CR1) states that a fair ML framework must furnish a clear and sufficiently detailed account of what renders ML models “(un)fair” in some respect or overall. This necessitates articulating the properties that are constitutive of (un)fairness in model predictions, as given by the operationalized target principle(s), which in turn derive from some more general theory of non-discrimination, equality, or justice. The Consistency Requirement (CR2) asserts that a framework should also yield identical evaluations for ML models that are identical in terms of those properties (“treat like models alike”).

The natural route is to view the fairness criteria included in the framework as explanations of model (un)fairness. Equalized Odds states that models are fair when their (expected) rates of false predictions are equal between comparison classes, for instance, whereas Fairness Through Awareness states that fairness requires that similar individuals receive similar predictions. Now, though these criteria present a partial account of model (un)fairness, they are rather abstract due to their formal mode of expression. This gives rise to problems related to underspecification, including instances where users must make value judgments while translating general notions of fairness into contextually applicable fairness metrics [8, p. 59].

Suppose Emily, the user in our running example, follows a framework that prescribes her to implement Fairness Through Awareness in the college admissions model. Implementing this criterion requires Emily to produce or adopt some quantitative measure of similarity that tracks college applicants’ decision-relevant features (e.g., factors related to academic performance) and excludes others (e.g., protected attributes such as gender). However, the criterion itself does not specify what the relevant features are, meaning Emily has to “make moral judgments about what fairness requires […] prior to determining or measuring similarity” [9, p. 2]. The guiding “principle” is underspecified and therefore indeterminate. Similar problems arise with criteria like Equalized Odds because their application is predicated upon the user selecting and operationalizing some relevant set of comparison classes (e.g., protected categories) before actually applying the metric(s). This is no simple task because those comparison classes may include complex social categories (e.g., ‘race’ and ‘gender’) that are notoriously difficult to operationalize and measure [12].
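The following sketch illustrates how much is left to the user at this point. The feature columns, weights, and similarity and output tolerances below are hypothetical choices that Emily herself would have to make and defend; the code merely flags pairs of applicants who count as ‘similar’ under one such metric yet receive markedly different predictions, a simplified, threshold-based stand-in for the requirement that similar individuals be treated similarly.

```python
import numpy as np

# Hypothetical applicant records: the two columns (say, GPA and a normalized test
# score) are the features Emily has judged to be decision-relevant; the protected
# attribute is deliberately excluded from the similarity metric.
applicants = np.array([
    [3.8, 0.91],
    [3.7, 0.89],
    [2.9, 0.55],
])
predicted_scores = np.array([0.82, 0.47, 0.31])  # model outputs for each applicant

def similarity_distance(x1, x2, weights=(0.5, 0.5)):
    """One possible similarity metric: weighted Euclidean distance over the
    features judged to be decision-relevant."""
    w = np.asarray(weights)
    return float(np.sqrt(np.sum(w * (x1 - x2) ** 2)))

def fta_violations(features, outputs, eps=0.15, delta=0.10):
    """Flag pairs of individuals who are 'similar' (distance below eps) yet
    receive predictions that differ by more than delta."""
    flagged = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            similar = similarity_distance(features[i], features[j]) < eps
            divergent = abs(outputs[i] - outputs[j]) > delta
            if similar and divergent:
                flagged.append((i, j))
    return flagged

print(fta_violations(applicants, predicted_scores))  # flags the pair (0, 1)
```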

Now, it is clear that some degree of contextual judgment can be reasonably expected from moral agents. Supposing Emily possesses certain prerequisites for moral agency (e.g., sufficient cognitive and deliberative capacities), we can expect her to make reasonable judgments about what features to include in the similarity metric, even without the framework specifying a comprehensive list of such features. Nonetheless, it is clear that available formal fairness criteria lack the necessary semantic and moral distinctions to function as sufficiently action-guiding targets for ML model design—both in the direct and indirect sense [8, 9]. Underspecification also makes fairness criteria prone to misapplication (e.g., “fairness-washing”) [8] and raises doubts about whether fairness evaluations performed by different users of the same framework are reliable and comparable [9]. Robust formal operationalization of fairness notions is one way to mitigate these problems, and recent studies provide ways forward in this regard (see [14, 15]). However, future research should also develop semantically rich verbal models and domain-specific guidelines to enable agents to translate existing criteria into contextually applicable metrics.

4.2 Determinacy requirement

The Determinacy Requirement (CR3) mandates that, upon correct application, fair ML frameworks should (i) indicate whether an evaluated ML model is (im)permissible or obligatory in light of the target principle(s), and (ii) rank the set of evaluated ML models according to their deontic statuses. This ensures that users such as Emily can compare different models with respect to their deontic status and establish whether they are ceteris paribus permitted to choose a given ML model or not.

Due to underspecification, fair ML frameworks can suffer from empirical and normative indeterminacy (see Sect. 2.2 above). Note, for instance, that fairness metrics typically provide quantified estimates of model (un)fairness on a continuous numerical scale. Emily might be able to use such estimates to compare and rank models in terms of the operative fairness criteria. The problem is that she must still determine whether a given amount of prima facie unacceptable bias renders a given model (im)permissible. The available models are not mapped onto deontic statuses, in other words, and hence the framework is empirically indeterminate. Normative determinacy may be compromised in cases where model selection involves tradeoffs, and the framework does not specify how they should be addressed. For instance, many independently plausible fairness criteria such as Equalized Odds (equal error rates) and Predictive Parity (equal positive predictive values) cannot be implemented simultaneously in most cases [4, 18]. Debiasing an unconstrained model also tends to decrease its overall predictive accuracy, resulting in a fairness–accuracy tradeoff [4, 7]. If the framework fails to specify how Emily should approach these dilemmas, it is normatively indeterminate and Emily must instead rely on some external basis to determine the right course of action [8, p. 62].

The previously described problems notably raise highly delicate substantive moral questions. What amount of bias renders a given model unfair and therefore prima facie impermissible? How should measures of fairness and overall model performance be weighed against each other? Is it even possible to compare these things? From the perspective of action-guidance specifically, however, there are some solutions available. For example, incorporating a clear numeric threshold (or a “cut-off point”) for (im)permissible bias in the model would solve the problem of empirical indeterminacy. Though this approach is arguably somewhat blunt, a well-specified threshold would enable Emily to determine the deontic status of available models in light of the operative fairness criteria. Note that the threshold might not be universally applicable; the required level of fairness might instead depend on things such as the model’s use-context, thereby restricting the scope in which the fairness criteria are intended to apply (see Sect. 4.3). It is also worth noting that the problem of empirical indeterminacy does not arise with frameworks that prescribe users to maximize fairness or to minimize unfairness according to some metric(s) (see [15]). The problem is nonetheless pertinent to many existing frameworks which define fairness as parity with respect to some model performance measure between individuals or groups (e.g., equal error rates in Equalized Odds).
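As a minimal sketch of that threshold approach, assume an illustrative cut-off of 0.05 and made-up metric values for three candidate models: once an explicit cut-off is fixed, each model can be mapped to a deontic status and the candidates can be ranked, which is precisely the empirical determinacy the bare metric does not provide.

```python
# Fairness-metric values (e.g., equalized-odds gaps) for three candidate models.
# Both the gap values and the 0.05 cut-off are assumptions chosen for illustration.
candidate_gaps = {"model_A": 0.02, "model_B": 0.04, "model_C": 0.11}
CUTOFF = 0.05

def deontic_status(gap, cutoff=CUTOFF):
    """Map a metric value to a deontic status via an explicit threshold."""
    return "permissible" if gap <= cutoff else "impermissible"

# Rank the candidates by their gap and report each model's status.
for name, gap in sorted(candidate_gaps.items(), key=lambda item: item[1]):
    print(f"{name}: gap={gap:.2f} -> {deontic_status(gap)}")
```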

What about the problem of normative indeterminacy? Here, a distinction should be made between (i) simple frameworks which incorporate a single fairness criterion and (ii) complex frameworks which include multiple criteria. Suppose it is true, morally speaking, that both Equalized Odds and Predictive Parity should be satisfied for a model to count as ‘fair’. Suppose now that Emily is following a simple framework which includes only one of these criteria, and yet finds that the two criteria clash. In this case, the problem is not that the framework fails to be action-guiding to Emily; it is instead incomplete because it fails to account for all constitutive criteria for model fairness. The problem pertains to the framework’s veracity and completeness as an explanatory account of model fairness. Here, the remedy is straightforward: the framework should be revised to include the missing criterion.

Even the revised complex framework, which now recognizes the two criteria as requirements of model fairness, might violate the Determinacy Requirement if it fails to specify how the user should address the tradeoff between them. But the appropriate response is again to revise the framework. To avoid normative indeterminacy, it could articulate, for example, a decision-rule or a suitable procedure for tradeoff resolution. It might state that Emily should prioritize Equalized Odds over Predictive Parity (or vice versa). It might state that the normative order of priority is dependent on the model’s use-context. It might even state that Emily should decide how to proceed after consulting affected stakeholders (e.g., college applicants). Whatever the tradeoff resolution mechanism, some such mechanism has to be articulated to avoid normative indeterminacy. For an overview of approaches to tradeoff resolution in ML, the reader is referred to [4].
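To show what such a decision rule could look like, here is a sketch of one candidate mechanism: a lexicographic priority ordering under which the Equalized Odds gap is considered first, the Predictive Parity gap second, and accuracy only breaks remaining ties. The gap values, the tolerance, and the ordering itself are assumptions chosen for illustration, not a recommendation.

```python
# Each candidate model is summarized by its equalized-odds gap, predictive-parity
# gap, and overall accuracy; all values are illustrative assumptions.
candidates = {
    "model_A": {"eo_gap": 0.03, "pp_gap": 0.09, "accuracy": 0.86},
    "model_B": {"eo_gap": 0.03, "pp_gap": 0.04, "accuracy": 0.83},
    "model_C": {"eo_gap": 0.10, "pp_gap": 0.01, "accuracy": 0.90},
}

def lexicographic_key(stats, tol=0.01):
    """Lexicographic priority: minimize the equalized-odds gap first, then the
    predictive-parity gap, and maximize accuracy last. Gaps are rounded to the
    tolerance grid so that near-ties count as ties."""
    return (round(stats["eo_gap"] / tol),
            round(stats["pp_gap"] / tol),
            -stats["accuracy"])

best = min(candidates, key=lambda name: lexicographic_key(candidates[name]))
print("Selected model:", best)  # model_B: tied with model_A on EO, better on PP
```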

4.3 Constituency and scope

The Scope Requirement (CR4) states that fair ML frameworks should have a well-defined scope of application, and the Constituency Requirement (AR1) asserts that users should know (or be able to find out) if and when an ML model falls within the defined scope. For example, the framework should specify when and under what circumstances Emily should implement the pre-defined fairness criteria (or some subset of them), and Emily should also be able to consult the framework for this information.

Alas, current research indicates that the Scope and Constituency Requirements are frequently unmet. On the one hand, most fair ML frameworks simply explicate what fairness criteria and debiasing techniques should be applied but neglect questions regarding moral and legal responsibilities [8, p. 58]. This is particularly problematic because a model’s application domain and use-context may bear on what notion of fairness should be implemented [36]. On the other hand, empirical studies show that users also tend to lack information (e.g., instructions and comparative guidelines) regarding the proper selection and application of fair ML tools and methods [16, 20]. Both problems contribute to failures of action-guidance, leaving users either unaware of their obligations to mitigate specific biases in the ML model or unable to ascertain what their respective obligations are. A practical solution is to define a clear scope of application for fair ML frameworks and to make the information accessible to users. However, this again requires that fair ML frameworks take a stance on complex moral questions concerning agents’ responsibilities in regard to non-discrimination and the enhancement of equality in local contexts of decision-making.

To illustrate, suppose that Emily’s model for predicting college admissions slightly favors applicants from higher socio-economic classes and, for this reason, risks reproducing structural inequalities in access to higher education. Here, Emily faces the moral question of whether it is right to design the model to select for the most qualified applicants or perhaps boost disadvantaged applicants’ chances to be admitted. But whether Emily should, specifically in her capacity as an ML model developer, intervene on the identified structural bias cannot be settled without considering various agents’ and stakeholders’ duties and obligations and other factors, such as the effectiveness of technological versus political interventions in enhancing equity in access to higher education [31]. Emily must consider whether non-discrimination law requires the college (qua education provider) to enhance equality between different socio-economic classes, for instance, and whether the college has potential conflicting duties toward other stakeholders (e.g., qualified applicants from higher socio-economic classes). Emily’s responsibilities are determined also in relation to the responsibilities that other agents (e.g., other education providers in the area, local policymakers, and the state) have in regard to dismantling structural barriers to higher education access. Overall, these multifaceted considerations underscore the importance of understanding the causes and sources of biases in ML models and how they connect to users’ and other stakeholders’ duties and responsibilities [8, pp. 60–61]. Indeed, these matters are crucial to determining right courses of action, but also relevant to defining the proper scope and constituency of fair ML frameworks.

4.4 Comprehension and correctness

The Comprehension Requirement (AR2) posits that users of fair ML frameworks should (be able to) recognize and possess or acquire the information required to correctly apply and comply with the chosen framework. This is supplemented by the Correctness Requirement (AR3) which adds that the acquired information should be accurate. For instance, Emily must be able to ascertain what properties or components she must observe and manipulate to properly evaluate and improve the college admissions model’s fairness. In standard fair ML frameworks, these things include, for instance, data distributions, model features, and/or output distributions. To produce sound assessments of the model’s fairness, and to derive coherent prescriptions regarding its improvement, Emily must also possess or acquire accurate information on these matters.

A widely observed problem is that acquiring the required information can be infeasible to users. For example, computing fairness measures oftentimes demands access to data on protected attributes (e.g., ‘race’ or gender) and sometimes also “ground truth” data (e.g., “true” answers to the prediction problems). However, access to data on protected characteristics is commonly restricted by data protection regulations and organizations’ internal policies [16, 35]. Even when available, the data may be incomplete, inconsistent, or otherwise unusable [12]. Similar problems pertain to “ground truth” data which can be unavailable by default since only realized decision outcomes (as opposed to counterfactual outcomes) tend to be recorded. Emily might be able to estimate the model’s true positive rates based on available data on the performance of admitted college applicants (e.g., completed courses and grades), but estimating false rejection rates is difficult since similar data cannot be captured on rejected applicants.

The previously described problems are suggestive of violations of Comprehension and Correctness. This conclusion gains further credibility once we recognize that determining what kind of data is required to evaluate model fairness, and how such data can be accessed, may depend heavily on context. Indeed, “[a]ssessing and mitigating unfairness in ML systems can depend on nuanced cultural and domain knowledge” which can be scarce in many ML development teams—at least in ones lacking domain experts [16, p. 7]. Supposing Emily is not an expert on education-related matters, she may struggle to ascertain what kinds of data she needs to evaluate the model’s fairness. Finding solutions to the described problems is again a considerable challenge. Problems with missing or incomplete data might be addressed in part through the development of novel methods for fairness evaluation (see [21]). However, there are no simple technological solutions to other prevalent issues such as ML development teams’ shortages of (domain-)expertise, localized knowledge, and practical resources. Making fairness-sensitive design accessible to diverse user demographics (including non-experts) may require creating domain-specific educational resources, guidelines, processes, and tools, but also implementing legislative and organizational changes that improve access to accurate testing data (see [16, 20, 28, 35]).

4.5 Coherence and ability

The Coherence Requirement (CR5) states that fair ML frameworks should not prescribe actions that do not cohere with relevant empirical facts, and the Ability Requirement (AR4) states that the user must have (or be able to acquire) the abilities required to act in conformity with the framework’s prescriptions. Here, we center again on questions of feasibility, though this time specifically in relation to improving and selecting models according to the chosen framework. In particular, we must consider whether ML models can feasibly satisfy the operative fairness criteria (i.e., comply with ought-to-be norms) and whether users can build models that adhere to those criteria (i.e., comply with ought-to-do norms).

Consider here the well-known tradeoff between Equalized Odds and Predictive Parity, which has been viewed as evidence of the “impossibility of fairness” in predictive models. Specifically, the so-called impossibility results show that whenever protected groups (G) exhibit different base rates with regard to the predicted attribute (Y) and the predictive model is “imperfect” in that it makes mispredictions, the two criteria cannot be satisfied simultaneously [18]. To be clear, the two criteria are mathematically compatible and thus theoretically coherent. The problem is that satisfying both criteria fully is practically infeasible in realistic settings because different groups tend to have unequal base rates and prediction errors are largely unavoidable. In Emily’s case, the impossibility results suggest that, if there are differences in academic performance between students from different ethnic groups, for instance, the college admissions model cannot exhibit both equal error rates and equal positive predictive values across those groups.
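The arithmetic behind this point can be sketched with made-up numbers. If both groups are assigned the same false positive and false negative rates (so Equalized Odds holds) but their base rates differ and the model makes some errors, their positive predictive values cannot coincide, and Predictive Parity fails; the base rates and error rates below are assumptions chosen only to exhibit the relationship.

```python
def positive_predictive_value(base_rate, fpr, fnr):
    """PPV implied by a group's base rate and its (shared) error rates."""
    true_positives = base_rate * (1 - fnr)    # per-capita true positives
    false_positives = (1 - base_rate) * fpr   # per-capita false positives
    return true_positives / (true_positives + false_positives)

# Same error rates for both groups (Equalized Odds holds), but different base rates
# and an imperfect model (fpr > 0): the resulting PPVs necessarily diverge.
fpr, fnr = 0.2, 0.1
ppv_a = positive_predictive_value(base_rate=0.6, fpr=fpr, fnr=fnr)
ppv_b = positive_predictive_value(base_rate=0.3, fpr=fpr, fnr=fnr)
print(round(ppv_a, 3), round(ppv_b, 3))  # ~0.871 vs ~0.659: Predictive Parity fails
```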

Suppose now that the fair ML framework states that Emily’s college admissions model should satisfy both Equalized Odds and Predictive Parity (or some other set of practically conflicting criteria). Are the Coherence and Ability Requirements violated in this case? Even if the criteria are indeed theoretically compatible, our intuition might lean toward an affirmative answer. After all, the framework prescribes Emily to eliminate all unfair disparities between ethnic groups in the model (as per the operative criteria) even though any model available to Emily will exhibit some such disparity (see [8, p. 62]). Nonetheless, I think that there are two reasons to think the problem is not as severe as it might seem. The first one is that an affirmative answer to the question presupposes an absolutist understanding of fairness criteria—a model is deemed permissible only if it does not exhibit any bias according to the pre-defined metric(s). The assumption is that Emily is only allowed to choose a model that satisfies Equalized Odds and Predictive Parity perfectly but not, for example, a model that closely approximates the criteria. The absolutist view notably differs from minimizing and maximizing approaches to model fairness which were mentioned in Sect. 4.2. It is also a rather uncommon view, and conflicts with the widely held notion that, in certain cases, such as in circumstances where ‘perfectly fair’ models are simply not available, users may be permitted to relax the fairness criteria.

Second, it is worth noting that, in the described scenario, the framework fails to be action-guiding to Emily in the direct sense: since the conjunction of Equalized Odds and Predictive Parity does not specify a model which Emily both should and can build, the Coherence and Ability Requirements are violated. Recall, however, that principles which prescribe idealistic but currently unattainable targets can still be action-guiding in an indirect manner. Some understand principles of justice to function like this: the principles specify a target state-of-affairs that society should seek to bring about and help agents identify heinous injustices (see [27]). In circumstances of infeasibility, the two fairness criteria are similarly indirectly action-guiding: they describe what an ML model ideally ought to be like and thus specify a desirable target to be achieved in the long run. Emily may not be able to build the fairest model now, but it may be possible for her to gradually improve the model’s fairness over time. Due to technological advancements and coordinated collective efforts to decrease differences in academic performance between ethnic groups, building even fairer models may become possible in the future. Meanwhile, the criteria orient Emily in the right direction by describing what kinds of biases she should seek to mitigate in currently feasible models, even if eliminating such disparities altogether is not currently possible.

5 Concluding remarks

This article has explored challenges related to action-guidance in AI ethics. I began by outlining a philosophical account of action-guidance which includes a set of conditions for action-guidance and situates the construction of usable moral principles as one practical desideratum of moral theorizing alongside the theoretical desideratum of explaining moral properties and moral behavior. The account stands as an independent philosophical contribution of this work which can be refined and further supplemented by future studies on the topic. I demonstrated that the proposed account can be used to address what I have called the action-guidance gap in AI ethics, which arises when responsible agents, such as ML developers, cannot use AI ethics frameworks as practical “guides” to build ML systems that align with the frameworks’ principles. Demonstrating the applicability of the account, I centered on the construction and application of fair ML frameworks in ML development contexts as a case example, drawing also on philosophical and empirical research on the content and practical application of existing frameworks. Future research could apply the account to frameworks and resources developed in other (sub)domains of AI ethics, such as transparency and model explainability, accountability, and environmental sustainability.

The outlined account has three general uses in the AI ethics domain, which I will reiterate next. First, the account can be used to specify Construct Requirements which indicate what kinds of content should be included in principle-based frameworks for AI ethics to ensure their proper application. I demonstrated this by specifying five Construct Requirements for fair ML frameworks (CR1–CR5 in Table 2) which highlight the need for a consistent set of well-specified fairness criteria and comparison classes, a clear scope of application, and detailed instructions regarding the evaluation and improvement of model fairness. This converges with findings from previous studies [8, 9, 12, 20]. Second, the account can be used to establish corresponding Agent Requirements understood as prerequisites for the correct application of AI ethics frameworks on the part of their users. I derived four Agent Requirements for action-guidance in fair ML practice (AR1–AR4 in Table 2) which indicate that users require, for instance, accurate evaluation data, domain- and context-specific information, and skills and practical resources to correctly implement fair ML frameworks and related tools. This aligns with findings from empirical studies on fair ML practice (see [16, 20, 28, 35]).

Lastly, the outlined account can be used in a negative mode to analyze issues with action-guidance that arise in real-life AI ethics contexts. In particular, I showed that previously identified issues with action-guidance in fair ML practice can be construed as violations of the proposed construct and agent requirements (Table 2). Broadly speaking, the examination showed that fair ML frameworks can fail to be action-guiding to users when they contain underspecified proxies for the operationalized principle(s) (or some broader theory) or when users lack the beliefs or abilities required to apply the framework correctly. This finding underlines that failures of action-guidance can be traced to defective principles (e.g., underspecified fairness criteria) and/or the defective application of principles (e.g., misapplication of fairness criteria). In addition, I noted that, due to practical constraints, users may be unable to apply even well-specified frameworks or unable to derive feasible prescriptions from such frameworks. In such cases, the applied framework is not directly action-guiding to the user. However, insofar as the framework is sufficiently specified, it can still provide indirect guidance to the user by specifying when an ML system deviates from the operative principle(s) and by specifying a long-term target that the user should seek to achieve, perhaps in coordination with other agents.

Though improving available frameworks and promoting users’ capacity to implement them constitute promising ways forward, there are no simple solutions to bridging the action-guidance gap in AI ethics altogether. This becomes increasingly evident once we recognize the plurality of values that ML systems should reflect, and the fact that ML systems increasingly operate embedded in complex, dynamic sociotechnical systems. Even a clear strategy for ethical action may prove insufficient in a complex system with many moving parts and many goals to be achieved. In this light, even the most detailed and informationally rich AI ethics frameworks (or ethics guidelines) might not eliminate the need for users to engage in contextual moral deliberation in real-life contexts—nor should we expect them to. Nonetheless, insofar as users require practical resources to implement AI ethics principles, ensuring that such resources are usable in real-life settings remains a crucial task. Hopefully, the proposed account can assist in this project by providing insight into the prerequisites of actionable AI ethics frameworks and their proper implementation.