This section describes the elements we have identified in the literature. We have categorized them as either features or practices. They are summarized in Tables 2, 3 and 4, and are described in detail in the following sections.
Built-in Safeguards Against Harmful Behavior
This feature introduces procedural safeguards limiting what AI systems can do unilaterally. One such safeguard is to make the automated decision-making process itself adversarial. This can be achieved by introducing a second automated system, external to the controlling organization, through which machine decisions are run. If the two systems disagree, the decision can be flagged for human review, or automated dispute resolution mechanisms can take over. Such adversarial procedures could occur on an ongoing basis, or at the request of human controllers or decision subjects. An additional benefit of a second (possibly public) system that decisions need to pass through is the creation of a record of all decisions made, which can aid outside scrutiny (Almada, 2019; Elkin-Koren, 2020; Lyons et al., 2021).
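The adversarial procedure described above can be illustrated with a minimal sketch. All names are hypothetical; the two models are stand-ins for the controlling organization's system and an independent external one, and the shared log illustrates the record-keeping benefit.

```python
# Illustrative sketch: route each automated decision through a second,
# external system and flag disagreements for human review. Every decision,
# flagged or not, leaves a trace in a shared log for outside scrutiny.

def adversarial_check(primary_model, external_model, case, log):
    """Run a decision past an independent second system; flag disagreement."""
    primary = primary_model(case)
    second_opinion = external_model(case)
    record = {"case": case, "primary": primary, "external": second_opinion}
    log.append(record)  # record created regardless of outcome
    if primary != second_opinion:
        record["status"] = "flagged_for_human_review"
        return None  # decision withheld pending review
    record["status"] = "confirmed"
    return primary

# Hypothetical models disagreeing on a borderline case:
log = []
decision = adversarial_check(lambda c: c["score"] > 0.5,
                             lambda c: c["score"] > 0.6,
                             {"score": 0.55}, log)
```

Here the borderline case is withheld and flagged, while the log retains both systems' verdicts for later review.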
In some cases, it may be necessary and possible to implement formal constraints on system behavior. These would protect against undesired actions, and demonstrate compliance with standards and legislation (Aler Tubella et al., 2020).
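One way such formal constraints might be operationalized is as explicit, machine-checkable rules evaluated at decision time. The sketch below is a simplified illustration; the constraint names, fields, and norms are assumptions made for the example, not prescriptions from the literature.

```python
# Illustrative sketch: ex-ante norms expressed as named predicates that are
# checked against every outcome. A violating outcome is withheld rather than
# issued, which both protects subjects and documents compliance.

CONSTRAINTS = [
    # hypothetical norm: no decision without the subject's consent on record
    ("no_decision_without_consent", lambda case, out: case.get("consent", False)),
    # hypothetical norm: a denial must carry a stated reason
    ("deny_requires_reason",
     lambda case, out: out["decision"] != "deny" or bool(out.get("reason"))),
]

def guarded_decision(model, case):
    out = model(case)
    violated = [name for name, ok in CONSTRAINTS if not ok(case, out)]
    if violated:
        return {"decision": "withheld", "violations": violated}
    return out

# A model that denies without giving a reason is blocked by the second norm:
result = guarded_decision(lambda c: {"decision": "deny"}, {"consent": True})
```

The list of violated constraint names doubles as evidence when demonstrating compliance with standards or legislation.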
Interactive Control Over Automated Decisions
This feature is primarily aimed at human controllers, although in some cases it may also be made available to decision subjects. It enables direct intervention in machine decisions. In HCI, the concept of mixed-initiative interaction refers to shared control between intelligent systems and system users. Such an approach may also be employed in the case of decision-support or semi-automated decisions. The final decision would be the result of a “negotiation” between system and user (Kluttz & Mulligan, 2019; Novick & Sutton, 1997 in Vaccaro et al., 2019). In some cases it may be possible to allow users to correct or override a system decision. This is of particular importance in a decision-support setting, where such corrections may also function as a feedback loop for further system learning (Bayamlıoğlu, 2021; Hirsch et al., 2017; Vaccaro et al., 2019, 2020). Where direct override is not a possibility, some form of control can be offered in an indirect manner by allowing users to supplement the data a decision is based on with additional contextual information (Hirsch et al., 2017; Jewell, 2018).
Explanations of System Behavior
This feature is primarily aimed at decision subjects but can also be of use to human controllers. It helps actors understand the decisions made by AI systems. A decision subject should know that a decision has been made, that there is a means of contesting it, and be provided with an explanation of the decision (Lyons et al., 2021). Explanations should contain the information necessary for a decision subject to exercise their rights to human intervention and contestation (Bayamlıoğlu, 2021; Lyons et al., 2021; Ploug & Holm, 2020).
Individual decisions should be reproducible and traceable. It should be possible to verify the compliance of individual decisions with norms. This requires version control and thorough record-keeping (Aler Tubella et al., 2020). Simply keeping an internal log could already be a substantial improvement. These records should include the state of the model, the inputs, and the decision rules at the time of producing a specific outcome (Bayamlıoğlu, 2021). The norms that decisions should adhere to should be elicited and specified ex ante (Aler Tubella et al., 2020).
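A minimal sketch of what such a per-decision record might contain is given below. The field names and the hash-based tamper check are illustrative assumptions, not a format proposed in the cited literature.

```python
# Illustrative sketch: each outcome is stored together with the model version,
# the inputs, and the decision rules in force at the time, so the decision can
# later be reproduced and checked against ex-ante norms.
import hashlib
import json
from datetime import datetime, timezone

def log_decision(model_version, inputs, decision_rules, outcome):
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,    # ties the outcome to a versioned model
        "inputs": inputs,                  # data the decision was based on
        "decision_rules": decision_rules,  # rules in force at decision time
        "outcome": outcome,
    }
    # a content hash makes later tampering detectable during audits
    record["digest"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    return record

# Hypothetical loan decision:
entry = log_decision("v2.3.1", {"income": 42000}, ["income >= 30000"], "approved")
```

Given such records, reproducing a decision amounts to re-running the logged rules against the logged inputs under the logged model version.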
Explanations should not simply be a technical account of how a model’s output relates to its input. They should also include the organizational, social and legal context of the decision. In other words, the emphasis shifts from explaining the computational rules to the decision rules, offering a behavioral model of the AI system as a whole, from a sociotechnical perspective (Aler Tubella et al., 2020; Almada, 2019; Brkan, 2019; Crawford, 2016; Hirsch et al., 2017). This behavioral approach accounts for the limitations of transparency efforts focusing on “the algorithm” in isolation (Ananny & Crawford, 2018 in Henin & Le Métayer, 2021). It also seeks to strike a balance between usability and comprehensiveness, in an effort to avoid the “transparency paradox” (Nissenbaum, 2011 in Crawford, 2016).
These requirements should be satisfiable even for models that are opaque due to their technical nature. Nevertheless, it may be desirable to reduce model complexity, e.g. by limiting the number of features under consideration, or by using fundamentally more intelligible methods (e.g. decision trees vs. deep neural networks) (Bayamlıoğlu, 2021).
Although explanations may be of a static form, if deep understanding and exploration of counterfactual scenarios are desired, “sandboxing” or “black box in a glass box” approaches are worth considering. Using these approaches, users are able to manipulate inputs and see how these affect outputs. These techniques can work without needing to fully describe decision rules, which may be useful for cases where these cannot or will not be disclosed (Höök et al., 1998 in Hirsch et al., 2017). By offering explanations that include confidence levels, human controllers can direct their focus to those decisions warranting closer scrutiny (Hirsch et al., 2017; Vaccaro et al., 2019).
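The sandboxing idea can be sketched as a simple what-if loop: the user varies one input at a time and observes how the outcome changes, while the decision rules themselves stay hidden. The model and feature names below are stand-ins for illustration.

```python
# "Black box in a glass box" sketch: probe an opaque model by varying one
# feature across counterfactual values, without disclosing its decision rules.

def what_if(model, case, feature, values):
    """Return the model's output for each counterfactual value of one feature."""
    outcomes = {}
    for value in values:
        variant = dict(case, **{feature: value})  # copy case, swap one feature
        outcomes[value] = model(variant)
    return outcomes

# Hypothetical opaque credit model (hidden from the user in practice):
model = lambda c: "approve" if c["income"] >= 30000 and c["debt"] < 10000 else "deny"
base = {"income": 28000, "debt": 5000}
result = what_if(model, base, "income", [25000, 30000, 35000])
```

In this toy case the subject can discover that their decision flips once income reaches 30000, without ever seeing the rule itself.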
Another way to deal with model opacity (due to their proprietary or sensitive nature) is to generate local approximations using techniques such as “model inversion”. However, once again we emphasize not to fixate on the technical components of AI systems in isolation (Hirsch et al., 2017; Leahu, 2016; Mahendran & Vedaldi, 2015; Ribeiro et al., 2016; Tickle et al., 1998 in Edwards & Veale, 2018).
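A local approximation in the spirit of LIME (Ribeiro et al., 2016) can be sketched with a few lines of standard-library Python: perturb the case of interest, query the black box, and estimate how sensitive the output is to each feature near that case. This is a simplified, stdlib-only illustration, not the published algorithm, which fits a weighted sparse linear model.

```python
# Illustrative sketch of a local approximation: estimate per-feature
# sensitivity of an opaque model around a single case by random perturbation.
import random

def local_sensitivity(black_box, case, scale=1.0, n=500, seed=0):
    rng = random.Random(seed)          # fixed seed for reproducibility
    base = black_box(case)
    sums = {f: 0.0 for f in case}
    for _ in range(n):
        # perturb every feature with Gaussian noise around the case
        sample = {f: v + rng.gauss(0, scale) for f, v in case.items()}
        delta = black_box(sample) - base
        for f in case:
            # accumulate covariance between output change and feature change
            sums[f] += delta * (sample[f] - case[f])
    return {f: s / n for f, s in sums.items()}

# Opaque model in which only x1 matters near this point:
weights = local_sensitivity(lambda c: 3 * c["x1"] + 0 * c["x2"],
                            {"x1": 1.0, "x2": 1.0})
```

The resulting weights identify x1 as the locally influential feature, yielding a human-readable approximation of the model's behavior around one decision.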
Explanations in the service of contestability should not simply describe why a decision was made, but also why the decision is considered good. In other words, decision subjects should receive a justification as well. This avoids the self-production of norms (Rouvroy, 2012 in Henin & Le Métayer, 2021).
Human Review and Intervention Requests
This feature is aimed at decision subjects, and at third parties acting on behalf of individual decision subjects and groups. It gives subjects the ability to “ask questions and record disagreements”, both on the individual and the aggregate scale (Hirsch et al., 2017; Ploug & Holm, 2020; Vaccaro et al., 2019).
Human controllers and decision subjects should not be mere passive recipients of automated decisions. They should be put in dialogue with AI systems. Reliance on out-of-system mechanisms for contestation is insufficient (Kluttz et al., 2019 in Henin & Le Métayer, 2021).
A commonly recommended mechanism for responding to post-hoc contestation is human review and intervention (Lyons et al., 2021). Requests for human intervention are necessarily post-hoc, since they happen in response to discrete decisions, when a subject feels a decision has harmed or otherwise impacted their rights, freedoms or interests (Almada, 2019). Such intervention requests could be facilitated through auxiliary platforms, or be part of the system itself (Almada, 2019; Bayamlıoğlu, 2021). Although existing internal or external review procedures are sometimes considered sufficient, in many cases new mechanisms for contestation will be required. Due process mechanisms should be designed into the AI system itself (Lyons et al., 2021).
Human review is seen as an antidote to machine error. Human controllers can use tacit knowledge, intuition, and access to contextual information to identify and correct harmful automated decisions. In this way, allowing for human intervention is a form of quality control (Almada, 2019; Walmsley, 2021).
In the context of the GDPR, the right to human intervention is tied to fully automated decision-making only (Brkan, 2019). In practice, such a distinction may not be so clear-cut. From a sociotechnical perspective humans are always part of the decision chain leading up to a machine decision, in the role of designers, developers and operators. Furthermore, the mere presence of a human at the very end of the chain (the so-called “human in the loop”) may not be a sufficient safeguard against machine error if human controllers do not have the authority or ability to base their final decision on more information than what was provided to them by the AI system (Almada, 2019). By extension, human controllers who respond to intervention requests should have the authority and capability to actually change previous decisions (Brkan, 2019).
It is of course entirely possible for human intervention to be biased, leading to worse outcomes compared to a fully automated decision. This should be guarded against by introducing comparative measures of the performance of human-controlled and fully automated procedures (Almada, 2019). AI system controllers must make room within their organizations for receiving, evaluating and responding to disputes (Sarra, 2020).
Channels for contestation should be clear, accessible, affordable and efficient so that further harm to subjects is minimized (Lyons et al., 2021; Vaccaro et al., 2021). Mechanisms for requesting human intervention should provide “scaffolding for learning” (Applebee & Langer, 1983; Salehi et al., 2017 in Vaccaro et al., 2020). Documentation of the decision-making procedures should be integrated with the appeal procedure and communicated in alternative formats, both to ease comprehension (Vaccaro et al., 2020) and to help subjects formulate their argument (Lyons et al., 2021; Vaccaro et al., 2021).
A risk of appeal procedures is that burdens are shifted to individual subjects. Ways of addressing this include allowing for synchronous communication with decision makers (Vaccaro et al., 2021), or having third parties represent subjects (Bayamlıoğlu, 2021; Edwards & Veale, 2018; Lyons et al., 2021; Vaccaro et al., 2020).
Another limitation of current appeal procedures is that they handle decisions individually (Vaccaro et al., 2019). Groups should be able to acquire explanations of decisions collectively. Developers should not only consider individual impacts, but also group impacts (Edwards & Veale, 2018). Mechanisms for contestability should allow for collective action, because harms can be connected to group membership (Lyons et al., 2021). One way to aid collective action would be to publicize individual appeals cases so subjects can compare their treatment to those of others, and identify fellow sufferers (Matias et al., 2015; Myers West 2018; Sandvig et al., 2014 in Vaccaro et al., 2020). Subjects should be supported in connecting to those who share their fate (Vaccaro et al., 2021).
Not every kind of human intervention in response to decision subjects’ appeals qualifies as actual contestation. Decision subjects should be able to express their point of view, if only to provide additional information based on which a decision may be reconsidered (Bayamlıoğlu, 2021). For true contestation to take place, not only should the subject be allowed to express their point of view, but there should also be a dialectical exchange between subject and controller (Mendoza & Bygrave, 2017 in Brkan, 2019). Contestation therefore includes human intervention, but should not be reduced to it. Care should also be taken to prevent contestability from becoming merely a way for subjects to complain about their plight. Because a dialectical exchange between humans and machines is not possible in any meaningful sense, contestations of this kind cannot be handled in a fully automated fashion. Computational logic can only offer an answer to the “how”, whereas a proper response to a contestation must also address the “why” of a given decision (Sarra, 2020). Contestability should include a right to a new decision, compensation for harm inflicted, or reversal (Lyons et al., 2021).
Tools for Scrutiny by Subjects or Third Parties
This feature supports scrutiny by outside actors (decision subjects, indirect stakeholders, third parties) of AI systems, separate from individual decisions. These tools for scrutiny mainly take the form of a range of information resources.
These should contribute to the contestability of the sociotechnical system in its entirety (Lyons et al., 2021). The aim is to justify the system as a whole (i.e. “globally”), rather than individual decisions (“locally”). This requires the demonstration of a clear link between high-level objectives (norms external to the technical system) and its implementation. Compliance is established by tracing this link through requirements, specifications, and the code itself.
Documentation should describe the technical composition of the system (Vaccaro et al., 2020). Such documentation may include up-to-date system performance indicators, in particular related to training data and models. Further documentation should describe how the system was constructed (i.e. documentation of the design and development process) (Selbst & Barocas, 2018 in Almada 2019), the role of human decision-makers, group or systemic impacts and how they are safeguarded against (Lyons et al., 2021). Mitchell et al. (2019) and Gebru et al. (2020) offer examples of possible documentation approaches.
Formal proof of compliance may be possible when a system specification can be described unambiguously, and its implementation can be verified (semi-)automatically. However, ML-based systems generally cannot be described using formal logic. Their performance is better assessed through statistical means (Henin & Le Métayer, 2021).
If a system makes a fully automated decision, it is recommended to include a means of comparing its performance to an equivalent decision-making procedure made by humans (Cowgill & Tucker, 2017 in Almada 2019).
If confidential or sensitive information must be protected that would aid in the assessment of proper system performance, it may be possible to employ “zero-knowledge proofs” in order to provide so-called opaque assurances (Kroll et al., 2016 in Almada 2019).
Ex-ante Safeguards
This practice focuses on the earliest stages of the AI system lifecycle, during the business and use-case development phase. It aims to put in place policy-level constraints protecting against potential harms. Developers should make an effort to anticipate the impacts of their system in advance (Brkan, 2019; Henin & Le Métayer, 2021; Sarra, 2020), and pay close attention to how the system may “mediate” new and existing social practices (Verbeek, 2015 in Hirsch et al., 2017). If after an initial exploration it becomes clear impacts are potentially significant or severe, a more thorough and formalized impact assessment should be performed (e.g. a Data Protection Impact Assessment (DPIA)) (Edwards & Veale, 2018; Lyons et al., 2021). Such assessments can also enforce the production of extensive technical documentation in service of transparency, and by extension contestability (Bayamlıoğlu, 2021). Any insights from this act of anticipation should feed into the subsequent phases of the AI system lifecycle. Considering AI system development tends to be cyclical and ongoing, anticipation should be revisited with every proposed change (Schot & Rip, 1997 in Kariotis & Mir, 2020).
If system decisions are found to impact individuals or groups to a significant extent, contestability should be made a requirement (Henin & Le Métayer, 2021). An obvious intervention would be to make contestability part of a system’s acceptance criteria. This would include the features identified in our framework, first and foremost means of acquiring explanations and human intervention (Almada, 2019; Brkan, 2019; Walmsley, 2021). Questions that must be answered at this point include what can be contested, who can contest, who is accountable, and what type of review is necessary (Lyons et al., 2021).
A final type of ex-ante safeguard is certification. This can be applied to the AI system as a software object, by either specifying aspects of its technological design directly, or by requiring certain outputs that enable monitoring and evaluation. It may also be applied to the controlling organization as a whole, which from a sociotechnical perspective is the more desirable option, since automated decisions cannot be reduced to an AI system’s data and model. However, certificates and seals are typically run in a for-profit manner and depend on voluntary participation by organizations. As such they struggle with enforcement. Furthermore, there is little evidence that certificates and seals lead to increased trust on the part of subjects (Bayamlıoğlu, 2021; Edwards & Veale, 2018).
Agonistic Approaches to ML Development
This practice relates to the early lifecycle phases of an AI system: business and use-case development, design, and training and test data procurement. The aim of this practice is to support ways for stakeholders to “explore and enable alternative ways of datafying and modeling the same event, person or action” (Hildebrandt, 2017 in Almada, 2019). An agonistic approach to ML development allows for decision subjects, third parties, and indirect stakeholders to “co-construct the decision-making process” (Vaccaro et al., 2019). The choices of values embedded in systems should be subject to broad debate, facilitated by elicitation of the, potentially conflicting, norms at stake (Henin & Le Métayer, 2021). This approach stands in contrast to ex-post mechanisms for contestation, which can only go so far in protecting against harmful automated decisions because they are necessarily reactive in nature (Almada, 2019; Edwards & Veale, 2018). In HCI, a well-established means of involving stakeholders in the development of technological systems is participatory design (Davis, 2009 in Almada, 2019). By getting people involved in the early stages of the AI lifecycle, potential issues can be flagged before they manifest themselves through harmful actions (Almada, 2019). Participants should come from those groups directly or indirectly affected by the specific AI systems under consideration. Due to the scale at which many AI systems operate, direct engagement with all stakeholders might be hard or impossible. In such cases, representative sampling techniques should be employed, or collaboration should be sought with third parties representing the interests of stakeholder groups (Almada, 2019). Representation can be very direct (similar to “jury duty”) or more indirect (volunteer or elected representatives forming a board or focus group) (Vaccaro et al., 2021).
Power differentials may limit the degree to which stakeholders can actually affect development choices. Methods should be used that ensure participants are made aware of and deal with power differentials (Geuens et al., 2018; Johnson, 2003 in Kariotis and Mir 2020).
One-off consultation efforts are unlikely to be sufficient, and run the risk of being reduced to mere “participation theater” or a box-ticking exercise. Participation, in the agonistic sense, implies an ongoing adversarial dialogue between developers and decision subjects (Kariotis & Mir, 2020). AI systems, like all designed artifacts, embody particular political values (Winner, 1980 in Crawford, 2016). A participatory, agonistic approach should be aimed at laying bare these values, and at creating an arena in which design choices supporting one value over another can be debated and resolved (although such resolutions should always be considered provisional and subject to change) (Kariotis & Mir, 2020). König and Wenzelburger (2021) offer an outline of one possible way of structuring such a process.
Quality Assurance During Development
This practice ensures safe system performance during the development phases of the AI system lifecycle. This includes collection of data and training of models, programming, and testing before deployment. A tried-and-true approach is to let the various stakeholder rights, values and interests guide development decisions. Contestability should not be an afterthought, a “patch” added to a system once it has been deployed. Instead, developers should ensure the system as a whole will be receptive and responsive to contestations. Care should also be taken to understand the needs and capabilities of human controllers so they will be willing and able to meaningfully intervene when necessary (Kluttz et al., 2018; Kluttz & Mulligan, 2019; Leydens & Lucena, 2018 in Almada, 2019; Kariotis & Mir, 2020; Hirsch et al., 2017). Before deploying a system, it can be tested, e.g. for potential bias, by applying the model to datasets with relevant differences (Ploug & Holm, 2020). Given the experimental nature of some AI systems, it may be very challenging to foresee all potential impacts beforehand on the basis of tests in lab-like settings alone. In such cases, it may be useful to evaluate system performance in the wild using a “living lab” approach (Kariotis & Mir, 2020). In any case, development should be set up in such a way that feedback from stakeholders is collected before actual deployment, and time and resources are available to perform multiple rounds of improvement before proceeding to deployment (Hirsch et al., 2017; Vaccaro et al., 2019, 2020). Developers should seek feedback from stakeholders both with respect to system accuracy and with respect to ethical dimensions (e.g. fairness, justice) (Walmsley, 2021).
Quality Assurance After Deployment
This practice relates to the AI system lifecycle phases following deployment. It is aimed at monitoring performance and creating a feedback loop to enable ongoing improvements. The design concept “procedural regularity” captures the idea that one should be able to determine if a system actually does what it is declared to be doing by its developers. In particular when models cannot be simplified, additional measures are required to demonstrate procedural regularity, including monitoring (Bayamlıoğlu, 2021). System operators should continuously monitor system performance for unfair outcomes both on individuals, and in the aggregate, on communities. To this end, mathematical models can be used to determine if a given model is biased against individuals or groups (Goodman, 2016 in Almada 2019). Monitoring should also be done for potential misuse of the system. Corrections, appeals, and additional contextual information from human controllers and decision subjects can be used as feedback signals for the decision-making process as a whole (Hirsch et al., 2017; Vaccaro et al., 2020). In some cases, feedback loops back to training can be created by means of “reinforcement learning”, where contestations are connected to reward functions. In decision-support settings, such signals can also be derived from occurrences where human controllers reject system predictions (Walmsley, 2021).
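The aggregate monitoring for unfair outcomes mentioned above can be illustrated with one of the simplest such mathematical measures, the demographic parity gap. The metric choice, group labels, and alert threshold below are illustrative assumptions, not recommendations from the cited sources.

```python
# Illustrative sketch of aggregate outcome monitoring: compare favorable-
# outcome rates across groups and flag the system for review when the gap
# between groups exceeds a chosen threshold.

def parity_gap(decisions):
    """decisions: list of (group, favorable: bool). Return (max gap, rates)."""
    totals, favorable = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        favorable[group] = favorable.get(group, 0) + int(ok)
    rates = {g: favorable[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates

# Hypothetical decision log with two groups:
gap, rates = parity_gap([("a", True), ("a", True), ("a", False),
                         ("b", True), ("b", False), ("b", False)])
if gap > 0.2:  # illustrative alert threshold
    print("flag for review:", rates)
```

Run continuously over the decision log, such a measure lets operators detect group-level harm that no individual appeal would surface on its own.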
Risk Mitigation Strategies
This practice relates to all phases of the AI system lifecycle. The aim is to intervene in the broader context in which systems operate, rather than to change aspects of what is commonly considered the systems themselves. One strategy is to educate system users on the workings of the systems they operate or are subject to. Such training and education efforts should focus on making sure users understand how systems work, and what their strengths and limitations are. Improving users’ understanding of systems may: (1) discourage inappropriate use and encourage adoption of desirable behavior; (2) prevent erroneous interpretation of model predictions; (3) create a shared understanding for the purposes of resolving disputes; and (4) ensure system operators along decision chains are aware of risks and responsibilities (Hirsch et al., 2017; Lyons et al., 2021; Ploug & Holm, 2020; Vaccaro et al., 2019, 2020).
Third Party Oversight
This practice relates to all phases of the AI system lifecycle. Its purpose is to strengthen the supervising role of trusted third party actors such as government agencies, civil society groups, and NGOs. As automated decision-making happens at an increasingly large scale, it will be necessary to establish new forms of ongoing outside scrutiny (Bayamlıoğlu, 2021; Edwards & Veale, 2018; Elkin-Koren, 2020; Vaccaro et al., 2019). System operators may be obligated to implement model-centric tools for ongoing auditing of systems’ overall compliance with rules and regulations (Bayamlıoğlu, 2021). Companies may resist opening up proprietary data and models for fear of losing their competitive edge and of users “gaming the system” (Crawford, 2016). Where system operators have a legitimate claim to secrecy, third parties can act as trusted intermediaries to whom sensitive information is disclosed, both for ex-ante inspection of systems overall and for post-hoc contestation of individual decisions (Bayamlıoğlu, 2021). Such efforts can be complemented with the use of technological solutions, including secure environments that function as depositories for proprietary or sensitive data and models (Edwards & Veale, 2018).
Contestable AI by Design: Towards a Framework
We have mapped the identified features in relation to the main actors mentioned in the literature (Fig. 2): System developers create built-in safeguards to constrain the behavior of AI systems. Human controllers use interactive controls to correct or override AI system decisions. Decision subjects use interactive controls, explanations, intervention requests, and tools for scrutiny to contest AI system decisions. Third parties also use tools for scrutiny and intervention requests for oversight and contestation on behalf of individuals and groups.
We have mapped the identified practices to the AI lifecycle phases of the Information Commissioner’s Office (ICO)’s auditing framework (Binns & Gallo, 2019) (Fig. 3). These practices are primarily performed by system developers. During business and use-case development, ex-ante safeguards are put in place to protect against potential harms. During design and procurement of training and test data, agonistic development approaches enable stakeholder participation, making room for and leveraging conflict towards continuous improvement. During building and testing, quality assurance measures are used to ensure stakeholder interests are centered and progress towards shared goals is tracked. During deployment and monitoring, further quality assurance measures ensure system performance is tracked on an ongoing basis, and the feedback loop with future system development is closed. Finally, throughout, risk mitigation intervenes in the system context to reduce the odds of failure, and third party oversight strengthens the role of external reviewers to enable ongoing outside scrutiny.