In this final section, we propose a new approach that moves away from attempts to define fairness mathematically and instead takes a more holistic view of the ethical considerations surrounding a model. Because fairness metrics embed subjective value judgements, it can be difficult to justify selecting one over another. Rather than relying on these general metrics, decision-makers should construct a customised measurement of what “fair” looks like for each model. In addition, fairness should not be considered in isolation from related ethical goals: the interaction between fairness and other values (e.g. welfare, autonomy, and explicability) should be taken into account in this analysis.
Contrary to claims otherwise [28], the roles and responsibilities of an engineer are necessarily intertwined with those of the domain expert or business stakeholder, because the ethical and practical judgements of what “success” looks like for the model directly influence algorithm design, build, and testing. Active engagement between the developer and the subject matter expert from the outset is needed to understand which inequalities should influence the outcome and how to address those that should not play a role in the prediction. This process requires engagement from all relevant parties, including the business owner and the technical owner, with potential input from regulators, customers, and legal experts.
Relying solely on the out-of-the-box fairness definitions implemented in fairness toolkits would fail to capture these nuanced ethical trade-offs. There are opportunities for open-source communities, technology companies, and other practitioners to contribute to and improve these toolkits; while this is beyond the scope of this paper, we have identified the key gaps in our previous paper [42].
For a decision-maker, it is important to devise customised success metrics specific to the context of each model, which, as described above, involves considering welfare (beneficence, non-maleficence), autonomy, fairness, and explicability. This can be done through the following process:
1. Define “success” from an ethical perspective. What is the benefit of a more accurate algorithm to the consumer, to society, and to the system? What are the potential harms of false positives and false negatives? Are there any fundamental rights at stake?
2. Identify the layers of inequality that are driving the differences in outcome.
3. Identify the layers of bias.
4. Devise an appropriate mitigation strategy. Note that this may require changes to the data collection mechanism or to existing processes, rather than a technical solution.
5. Operationalise these objectives into quantifiable metrics, build multiple models, and calculate the trade-offs between the objectives across all ethical and practical dimensions.
6. Select the model that best reflects the decision-maker’s values and relative prioritisation of objectives.
We now elaborate on each of these steps in turn.
Define success
Each use case brings unique considerations about what counts as a “successful” model, and these are unlikely to be captured in a single mathematical formula. In credit risk evaluation, for example, three key objectives from ethical, regulatory, and practical standpoints are: (1) allocative efficiency: a more accurate assessment of loan affordability protects both the lender and the customer from expensive and harmful default; (2) distributional fairness: increasing access to credit for disadvantaged borrowers, including “thin-file” borrowers and minority groups; (3) autonomy: mitigating both the increased scope of harm from identity theft and security risks, and the effects of ubiquitous data collection on privacy [2]. Here, a successful credit risk model would achieve all three objectives. By contrast, in algorithmic hiring, success metrics may include employee performance, increased overall diversity among employees and in leadership, and employee satisfaction with the role. It is important to identify all the objectives of interest, so that any trade-offs between them can be readily identified, allowing for a more holistic view of algorithmic ethics.
Identify sources of inequality
As previously discussed, because the sources of inequality and bias affecting an algorithm are complex and entangled, there is no simple mathematical solution to unfairness. It is important to understand which types of inequality are acceptable and which are unacceptable in each use case. Table 2 presented different layers of inequality. In credit risk evaluation, for instance, socioeconomic and talent inequalities may be considered relevant: if a man has a higher income than a woman, he may receive a higher credit limit given his greater ability to repay, and a higher education level or expertise in a high-demand field may indicate greater job security. Forcing the decision-maker to look beyond the legally protected characteristics, to identify which inequalities are acceptable and relevant and which are not, helps better identify the sub-groups at risk of discrimination.
We previously argued that computer scientists and model developers should actively engage in the discussion of which layers of inequality should and should not influence the model’s predictions, in order to inform their decisions in the development process. Proposing an accountability mechanism, such as the assignment of roles and responsibilities, is beyond the scope of this paper. However, we acknowledge the importance of the topic: we have addressed how to embed risk governance in the AI development lifecycle in a previous paper [40], and we have mapped tools and techniques for unfair bias mitigation to a standard organisational risk management lifecycle [43]. We have also proposed a framework for “reviewability” to ensure that logs, reporting, and audit trails are fit for the purpose of understanding AI models [8]. In these papers, we emphasise the need for ethical principles to be operationalised and embedded into organisational processes, ensuring that the right stakeholders are involved at the appropriate stage and that the accountability and responsibility for each ethical risk is clear.
Identify sources of bias
In addition to the inequalities discussed above, there may be biases in the model development lifecycle that exacerbate the existing inequalities between two groups. The challenge is that in many cases, the patterns associated with the target outcome are also associated with one’s identity, including race and gender.
Suresh and Guttag [54] have recently grouped these types of bias into six categories: historical, representation, measurement, aggregation, evaluation, and deployment. Historical bias refers to past discrimination and inequalities; the remaining five biases, displayed in Table 5, align with the phases of the model development lifecycle (data collection, feature selection, model build, model evaluation, and productionisation) in which they may inaccurately skew the predictions. By understanding the type of bias that exists, the developer can identify the phase in which the bias was introduced, allowing him or her to design a targeted mitigation strategy for each bias type.
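To make this taxonomy actionable during development, the category-to-phase alignment can be written down directly, so that each identified bias is routed to the lifecycle phase where its mitigation should begin. The sketch below is only an illustration of that idea, not something prescribed by [54]; the phase labels follow the lifecycle stages named above, and the `route_bias` helper is hypothetical.

```python
# Illustrative sketch: aligning the six bias categories with the lifecycle
# phase in which each typically arises. Historical bias predates the model
# lifecycle, so it is flagged separately rather than mapped to a phase.
BIAS_TO_PHASE = {
    "historical": "pre-existing (before data collection)",
    "representation": "data collection",
    "measurement": "feature selection",
    "aggregation": "model build",
    "evaluation": "model evaluation",
    "deployment": "productionisation",
}

def route_bias(bias_type: str) -> str:
    """Return the lifecycle phase where mitigation of this bias should start."""
    try:
        return BIAS_TO_PHASE[bias_type]
    except KeyError as err:
        raise ValueError(f"Unknown bias type: {bias_type!r}") from err

if __name__ == "__main__":
    for bias in BIAS_TO_PHASE:
        print(f"{bias:>14} -> {route_bias(bias)}")
```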
Table 5 gives examples of racial discrimination in lending processes to illustrate each type of bias. For a practical tool for identifying unintended biases across these six categories, see [43]. Crucially, Suresh and Guttag point out that effective bias mitigation addresses the bias at its source, which may call for a non-technical solution; for example, bias introduced through the data collection process may require a change in marketing strategy.
Table 5 Layers of bias resulting in inaccurate predictions (partial and indicative)
Design mitigation strategies
The mitigation strategy depends on whether we believe the inequalities in Table 2 and the biases in Table 5 need to be actively corrected and rebalanced. It is important to understand the source of each bias in order to address it.
Existing methods have been proposed for pre-processing (removing bias from the data before the algorithm is built), in-processing (building an algorithm subject to bias-related constraints), and post-processing (adjusting the output predictions of an algorithm). However, these methods presume that the inequalities in Table 2 and the biases in Table 5 are known and can be quantified and surgically removed. How do we isolate the impact of talent and preference inequalities on income from the impact of discrimination? Attempts to “repair” the proxies to remove racial bias have been shown to be impractical and ineffective when the predictors are correlated with the protected characteristic; even strong covariates are often legitimate factors for decisions [11].
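To make the pre-processing family concrete, the sketch below implements a simple group-and-label reweighing in the style of Kamiran and Calders, in which each record with group g and label y receives the weight P(G=g)P(Y=y)/P(G=g, y), so that group and label become statistically independent in the weighted data. It is a minimal illustration of the kind of correction being discussed (the column names are hypothetical), and it presumes exactly what the paragraph above questions: that the bias to be removed can be isolated and quantified.

```python
import pandas as pd

def reweighing_weights(df: pd.DataFrame, group_col: str, label_col: str) -> pd.Series:
    """Kamiran-Calders-style reweighing: weight each row by
    P(G=g) * P(Y=y) / P(G=g, Y=y), making group membership and the label
    independent under the weighted empirical distribution."""
    n = len(df)
    p_group = df[group_col].value_counts(normalize=True)
    p_label = df[label_col].value_counts(normalize=True)
    p_joint = df.groupby([group_col, label_col]).size() / n
    return df.apply(
        lambda row: p_group[row[group_col]] * p_label[row[label_col]]
        / p_joint[(row[group_col], row[label_col])],
        axis=1,
    )

# Toy usage: approval rates differ by group, so approvals in the
# lower-approval group receive larger weights.
data = pd.DataFrame({
    "group":    ["a", "a", "a", "a", "b", "b", "b", "b"],
    "approved": [ 1,   1,   1,   0,   1,   0,   0,   0 ],
})
data["weight"] = reweighing_weights(data, "group", "approved")
print(data)
```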
Often, the solution to these biases is not technical, because their sources are not inherent to the modelling technique. Instead of looking for a mathematical solution, there may be more productive ways of counteracting these biases through changes to process and strategy; examples are shown in Table 6.
Table 6 Possible actions to counteract biases (partial and indicative)
While the mitigation strategies are important, they are unlikely to provide a complete solution to the problem of algorithmic bias and fairness. That is because, unlike the assumptions underlying fairness formalisations, it is often not feasible to mathematically measure and surgically remove unfair bias from a model, which is affected by inequalities and biases that are deeply entrenched in society and in the data.
Legal scholars have argued that the traditional approach of scrutinising the inputs to a model is no longer effective given rising model complexity. Using Fair Lending law as an example, Gillis demonstrates that classifying features as relevant or irrelevant fails to address discrimination concerns, because combinations of seemingly relevant inputs may drive disparate outcomes between racial groups [24]. Rather than focusing on identifying and justifying the inputs and policies that drive disparities, Gillis argues, it is important to shift to an outcome-focused analysis of whether a model leads to impermissible outcomes [24]. Similarly, Lee and Floridi have proposed an approach to assessing whether the outcome of a model is desirable [41]. For a more comprehensive analysis of whether a model meets the stakeholders’ ethical criteria, it is important to look beyond the inputs and the designer’s intent and to assess the long-term, holistic outcome.
Operationalise Key Ethics Indicators (KEIs), calculate trade-offs between KEIs
Once “success” for a model has been defined at a high level, the next step is to operationalise the ethical principles so that they are measurable. Just as a company defines a set of quantifiable values, Key Performance Indicators (KPIs), to gauge its achievements, there should be outcome-based, quantifiable statements from an ethical standpoint: Key Ethics Indicators (KEIs), enabling developers to manage and track the extent to which each model meets the stated objectives.
For example, Lee and Floridi estimate the impact of each default risk prediction algorithm on financial inclusion and on loan access for black borrowers [41]. They operationalise financial inclusion as the total expected value of loans under each model, and minority loan access as the loan denial rate for black applicants under each model. In Fig. 1, replicated from their work, they calculate the trade-offs between the two objectives for five algorithms, providing actionable insights for all stakeholders on the relative success of each model.
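As a rough illustration of how such KEIs might be computed in practice (not the actual procedure in [41], and using hypothetical column names), one could tabulate a simplified financial-inclusion measure, here the total value of approved loans rather than the expected-value calculation used by the authors, alongside the denial rate for black applicants under each candidate model:

```python
import pandas as pd

def credit_keis(df: pd.DataFrame) -> dict:
    """Simplified KEIs in the spirit of the operationalisation described above.
    Column names ('approved', 'loan_value', 'race') are hypothetical."""
    approved = df["approved"] == 1
    black = df["race"] == "black"
    return {
        "financial_inclusion": float(df.loc[approved, "loan_value"].sum()),
        "minority_denial_rate": float((black & ~approved).sum() / black.sum()),
    }

def kei_table(applicants: pd.DataFrame, model_decisions: dict) -> pd.DataFrame:
    """Tabulate the KEIs side by side for each candidate model's approval decisions."""
    rows = {name: credit_keis(applicants.assign(approved=decisions))
            for name, decisions in model_decisions.items()}
    return pd.DataFrame(rows).T
```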
Context-specific KEIs can be developed for each use case. In algorithmic hiring, for example, employee satisfaction with a role may be estimated through attrition rates and employee tenure, employee performance may be measured through the annual review process, and diversity may be calculated across gender, university, region, age group, and race, depending on each organisation’s objectives and values. Making the ethical objectives explicit in each use case would help decision-makers justify the use of any algorithm, which could in turn lead to the establishment of industry standards, informing best practices, policy design, and regulatory activity.
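A comparable set of hiring KEIs could be sketched along the same lines; the column names below are hypothetical placeholders, and the diversity measure shown (the Blau index, one minus the sum of squared group shares) is just one common choice rather than anything prescribed in the text:

```python
import pandas as pd

def hiring_keis(staff: pd.DataFrame) -> dict:
    """Illustrative hiring KEIs; all column names are hypothetical."""
    gender_shares = staff["gender"].value_counts(normalize=True)
    return {
        "attrition_rate": float(staff["left_within_year"].mean()),
        "median_tenure_years": float(staff["tenure_years"].median()),
        "mean_review_score": float(staff["review_score"].mean()),
        # Blau index: 0 means a single group, values near 1 mean an even spread.
        "gender_diversity_blau": float(1 - (gender_shares ** 2).sum()),
    }
```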
Select a model and provide justifications
The trade-off analysis makes the ethical considerations clear. For example, in Fig. 2, Lee and Floridi conclude that Random Forest is better in absolute terms (on both financial inclusion and impact on minorities) than Naïve Bayes, but that the decision is more ambiguous between CART and LR: while CART is more accurate and results in greater financial inclusion (the equivalent of $15.6 million in loans, or 103 median-value loans), it also results in a 3.8 percentage point increase in denial rates for black loan applicants compared to LR. This quantifies the concrete stakes for the decision-maker, who may then select the model best suited to his or her priorities in each use case.
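This kind of comparison can be made systematic with a Pareto-dominance check: a model dominates another when it is at least as good on every KEI and strictly better on at least one, and the surviving non-dominated candidates, like CART and LR in the example above, are precisely the ones between which a value judgement is still required. A minimal sketch, with hypothetical KEI values (not taken from the cited study) oriented so that higher is better:

```python
def dominates(a: dict, b: dict) -> bool:
    """True if model `a` is at least as good as `b` on every KEI
    (all KEIs oriented so that higher is better) and strictly better on one."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)

def pareto_front(models: dict) -> list:
    """Candidates not dominated by any other model; choosing among them
    reflects the decision-maker's relative prioritisation of the KEIs."""
    return [
        name for name, keis in models.items()
        if not any(dominates(other, keis) for o, other in models.items() if o != name)
    ]

# Hypothetical, illustrative KEI values; minority loan access is expressed
# as 1 - denial rate so that higher is better for both indicators.
candidates = {
    "naive_bayes":   {"financial_inclusion": 0.71, "minority_loan_access": 0.80},
    "random_forest": {"financial_inclusion": 0.78, "minority_loan_access": 0.84},
    "cart":          {"financial_inclusion": 0.80, "minority_loan_access": 0.79},
    "logistic":      {"financial_inclusion": 0.76, "minority_loan_access": 0.85},
}
print(pareto_front(candidates))  # naive_bayes is dominated; the other three remain
```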
One of the key benefits of the outcome-driven KEI trade-off analysis is that it provides interpretable and actionable insight into the decision-maker’s values, which is especially important for complex machine learning algorithms whose exact mechanisms may not be transparent or interpretable. It could also provide valuable justification to the regulator for why a certain model was judged preferable to all other reasonable alternatives. It may likewise help reduce decision-makers’ hesitation around the use of machine learning models, given their non-transparent risks, if the analysis shows that such models are superior to traditional rules-based models in meeting each of the KEIs. Suitable records of these decisions must be kept, ensuring that the model and its design are reviewable [8].