Key Take-Aways

  • Data ethics management processes can be grouped into two categories: processes for spotting and escalating data ethics issues, and processes for reaching decisions about these issues.

  • Organizations use a number of different processes for spotting data ethics issues. These include a “hub and spokes” approach that places ethics professionals in the business units, where they can spot issues and escalate them to the center; an external advisory group that sensitizes the organization to risks it might not otherwise perceive; checklists that encourage engineers to consider and avoid actions that can lead to ethical issues; regular meetings at which employees discuss data ethics issues; and peer-to-peer discussions with data ethics managers from other organizations.

  • Organizations that prioritize speed tend to localize data ethics decision-making in an individual; those that prioritize deliberation tend to localize it in a cross-functional data ethics committee. Our research suggests that Silicon Valley-type companies that prioritize speed to market tend to vest data ethics decision-making authority in one individual who may have a direct line of communication with the C-Suite or CEO. By contrast, companies in more traditional sectors seem to rely on a cross-functional data ethics committee that, while slower, is able to provide multiple perspectives and reach a more considered, and perhaps higher quality, decision.

  • Data ethics committees follow different decision-making models. Some require consensus, while others follow a majority rule. Some have the power to make final decisions and even stop proposed projects, while others offer recommendations to the business units but do not hold the final decision-making authority. A company’s culture and management style will likely determine which approach it prefers.

  • Data ethics management programs vary in scope. Some focus only on the company’s own use of advanced analytics and AI, while others take a more systemic approach that encompasses suppliers and customers.

Putting the right structures and personnel into place is only the first step towards data ethics management. An organization also needs to establish the processes by which these personnel will interact with the rest of the organization, and with each other, to achieve the business’s data ethics goals. As the interviewees described them to us, these processes seemed to break down into two somewhat overlapping categories: processes for spotting data ethics issues and processes for reaching a decision about these issues. This chapter addresses each of these in turn.

8.1 Processes for Spotting Data Ethics Issues

The interviewees described a number of issue spotting practices.

8.1.1 Touring the Business Units

Under the first, which we saw more in fast-paced, Silicon Valley companies, the team with primary responsibility for data ethics (e.g., a data ethics office or privacy office) largely assumed the issue spotting function. This team went out into the business units to meet with developers, learn about their projects, and help them to spot potential ethical issues. This model got the ethics team out of its office and into the business units, allowing it to problem-solve and address issues quickly. This may be why faster-paced, Silicon Valley-type companies preferred it. The disadvantage, however, is that it relies on a small group of individuals to spot ethical implications throughout the entire company and so can lead to important issues being missed. It does not scale. One ethics specialist explained just how challenging this can be:

[O]ur team is small, there are 12 of us trying to support 2,000 deployments all over the world. I am currently at this year: 250,000 miles on [airline]. We are stretched very thin trying to keep up with everything . . . So in terms of flagging issues it is very spotty, and ad hoc and one of our big worries is something is going to happen that we’re missing. And you think about code and how many million lines of code there is, how many complex, how many little decisions might actually have huge implications, it’s difficult to figure out how to scale it in a way that would systematically catch everything (Interviewee #10).

8.1.2 Hub and Spokes

The second approach was to place a junior privacy or ethics professional in each business unit. These professionals were trained to spot ethical issues and, where such issues were significant and difficult to resolve, to refer them back to the central ethics team for further evaluation and resolution.Footnote 1 One interviewee referred to this as a “hub and spokes” model.

[P]rivacy reviews are initially conducted by a privacy manager, which is typically a non-lawyer, sitting in a privacy team within the business. So we have sort of a hub and spoke model, where we have distributed a set of privacy managers who are out there in the business. Close to the business decision makers, close to the engineers, doing the privacy reviews according to the processes and standards that have been developed at the hub, in the center, and distributed it out. They are supposed to flag those issues. And the high-risk issues will get escalated to a legal person, who may then further escalate them to one of the central subject matter experts. . . . So there's a process for initial review, sort of issue spotting escalation. And that often works. . . . [H]aving that process in place is invaluable in that we do get eyes on these things very early, at different levels (Interviewee #16).

This decentralized, hub-and-spokes approach seemed to scale better than the centralized one. It appears to be gaining popularity, particularly among larger, more established companies that have many business units in which such ethical issues might arise.Footnote 2

8.1.3 External Advisory Group

Some companies used an external advisory group to spot issues. Such a group—made up of privacy advocates, academics, industry people, former regulators, and others—gave the ethics officers a sense of what others might find troubling and so increased their sensitivity to potential ethical concerns. One interviewee referred to this as “pressure test[ing]” the company’s future data practices from an external perspective (Interviewee #9). Consulting with the external advisory group also gave the ethics team a way to gauge public expectations and so, consistent with the risk mitigation approach to data ethics, align the company’s data practices with these expectations.

In some instances, the external advisory group provided the ethics team with additional leverage for advocating its views within the company. As one privacy and ethics leader put it, “we needed backup. We needed a credible group of people who could provide the really solid [feedback], who we could point to and say ‘look, they agree with this analysis’.... So that’s what it was initially formed as.... that’s the sort of network we built up to do that... primarily academics, but we also wanted to get advocates in there.” (Interviewee #10).

Some companies established standing, external stakeholder committees. For example, one set up an external advisory board that included leading privacy advocates and academics. This board met regularly during the year and corresponded on a more ad hoc basis through emails. The interviewee explained that “they’re under an NDA, [so] we can bounce ideas off them, we can show them deployments, we show them technologies, and get their feedback, so that catches things we might have missed, or gives us a perspective from outside the company which is very helpful.” (Interviewee #10).

Other companies used a more ad hoc approach, convening groups of stakeholder experts to address particular issues when they arose. “We have the ability to contact consultants and people on the outside... and say, ‘We’re tackling with this issue, can you help us review this?’ When do we do it?... [We do it] when we feel like the project is about something that we do not have in-house expertise in. And literally, if we feel like we’re probably not the right people to review this, then we can go external.” (Interviewee #14). This additional input can be helpful. For example, one interviewee recounted a time “when [a company that ran an Internet search engine] wanted to know if it was a good idea to give people the option of sharing all their searches on Facebook. And so they convened a consumer panel. They said it would be purely voluntary, but should we even allow it as an option? And the panel unanimously said no – you shouldn’t allow people to trap themselves, because while they think there isn’t any harm in that, you can come up with a parade of horribles from sharing your searches on Facebook.” (Interviewee #23).

At the time of our research, the use of external data ethics advisory committees remained relatively uncommon. Even among the companies represented in the survey sample, only eleven percent utilized an external advisory committee for this purpose (Table 8.1). That said, the absence of a formal external committee did not necessarily mean companies were not seeking external insight informally.

Table 8.1 Does your company use an external advisory committee?

Companies do need to be thoughtful about whom they appoint to such external bodies. Google’s appointment of a polarizing figure to such a group provoked such an adverse reaction that the company had to disband the group a week after creating it (Waters 2019).

8.1.4 Checklists

In his book The Checklist Manifesto: How to Get Things Right, Gawande (2009) popularized the idea that checklists can be a useful way for organizations to get their people to operationalize broad concepts and apply them consistently. Many industries and professions, including medicine, aviation and structural engineering, use them for this purpose. The interviewees indicated that some organizations are beginning to use data ethics checklists in order to get employees to operationalize and apply data ethics principles (Interviewee #19). One interviewee described their company’s instrument as a “set of interrogatories that we're developing right now to get in front of the analytics teams that are going to be asking for data. It's based on some of [our data ethics] principles, but they're very simple questions, and they're more reflective. They get folks to think [about data ethics issues] before they take the deep dive into the data.” (Interviewee #16).
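To make this concrete, the sketch below shows one minimal way a set of reflective interrogatories like this might be implemented as a gate on data requests. It is an illustration under stated assumptions: the questions, the DataRequest structure, and the gating rule are all hypothetical, not the instrument any interviewee described.

```python
from dataclasses import dataclass, field

# Illustrative reflective questions; wording and structure are assumptions,
# since the interviewees' actual instruments were not shared with us.
REFLECTIVE_QUESTIONS = [
    "Whom could this analysis affect, and would they be surprised by it?",
    "Could the data or the model disadvantage any demographic group?",
    "Is this use consistent with the purpose for which the data was collected?",
    "What is the worst plausible headline if the project became public?",
]

@dataclass
class DataRequest:
    """A hypothetical analytics-team data request carrying documented answers."""
    project: str
    answers: dict = field(default_factory=dict)

def ready_for_review(request: DataRequest) -> bool:
    # A gate, not a grade: every question needs a substantive answer
    # before the request moves on to data-access review.
    return all(request.answers.get(q, "").strip() for q in REFLECTIVE_QUESTIONS)

request = DataRequest("churn-model-v2")
request.answers[REFLECTIVE_QUESTIONS[0]] = "Existing subscribers; no surprise expected."
print(ready_for_review(request))  # False until every question is answered
```

The design choice here mirrors what the practitioners told the interviewers: the questions prompt reflection before data access rather than prescribing technical actions.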

The companies in our sample were still at an early stage in their development of data ethics checklists and were not able to make them available to us. A 2020 Microsoft Research article offers a resource for companies or policymakers interested in seeing what such a checklist might look like (Madaio et al. 2020). The research team, which included a Carnegie Mellon Ph.D. candidate, conducted semi-structured interviews with 14 data analytics practitioners to get a general sense of what these data scientists would look for in a data ethics checklist. They then engaged in an iterative process with 48 practitioners working on a variety of AI systems to co-design a model AI Fairness checklist.

The Microsoft Research team’s interviews resonated in some ways with our interview findings. Practitioners explained to the Microsoft Research team that they found abstract data ethics principles to be hard to put into practice. They viewed checklists as a way to operationalize, and make more concrete, abstract concepts such as AI fairness. The practitioners also highlighted a potential downside to using checklists: they can breed a compliance-oriented mentality in which employees check the required boxes without engaging with the nuanced and context-based questions that data ethics issues often raise. In their view, checklists were best used to initiate reflection and conversation about issues such as fairness, bias, manipulation or transparency, rather than to provide discrete technical actions that engineers must follow. This fits with our finding, described above, that companies are coming to see data ethics as a strategic activity focused on improving the customer experience and building trust, and not as a compliance function.

The model AI Fairness checklist that the Microsoft Research team and practitioners co-designed, and that appears at the end of their article, consists of questions to consider, actions to take and items to document at six distinct stages in the product development process: (1) Envision (envisioning or greenlighting meetings); (2) Define (spec or design reviews); (3) Prototype (go/no-go discussions and code reviews); (4) Build (ship reviews); (5) Launch; and (6) Evolve (product reviews). Consisting of six sections, and running almost six pages, the checklist is quite long. But it becomes easier to comprehend when one realizes that it contains several core themes that are repeated throughout the various stages (a short illustrative encoding of this structure appears after the list). These are:

  • Identify those whom the AI system in question might impact, including particular demographic groups.

  • Examine the types of fairness-related harms that the AI system might impose on such stakeholders (e.g., allocation, quality of service, stereotyping, denigration, over- or underrepresentation), how these compare to the system’s benefits, and whether there are trade-offs between particular fairness criteria.

  • Scrutinize and clarify definitions—of system architecture, datasets, potential fairness-related harms, fairness criteria and metrics—and revise them as necessary to mitigate any fairness-related harms.

  • Solicit input from a diverse group of reviewers and stakeholders regarding vision, potential harms, definitions, fairness criteria, datasets, etc.

  • Where feasible, test the product with these diverse reviewers so that they can better understand it and provide feedback on it.

  • Monitor product implementation for deviation from expectations and for anticipated or unanticipated fairness-related harms.

  • Revise the vision, definitions, datasets, fairness criteria, prototype, etc. in order to mitigate potential harms.

  • If it is not possible to mitigate the potential harm, explore and document why this is the case, future mitigation or contingency plans, and whether it makes sense to proceed with the project at all.

  • Revise the system at regular intervals to improve its fairness performance and take account of changing social expectations or norms.
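As a reading aid, the following sketch encodes the checklist's structure as data: six lifecycle stages, each revisiting the recurring themes paraphrased above. The stage names come from Madaio et al. (2020); the short theme keys and the uniform stage-to-theme mapping are our simplification, not the checklist itself.

```python
# Lifecycle stages named in Madaio et al. (2020); the theme keys are ours.
STAGES = ["Envision", "Define", "Prototype", "Build", "Launch", "Evolve"]

THEMES = {
    "identify_stakeholders": "Whom might the system impact, including demographic groups?",
    "examine_harms": "What fairness-related harms, benefits, and trade-offs are in play?",
    "clarify_definitions": "Are architecture, dataset, and fairness definitions precise?",
    "solicit_input": "Have diverse reviewers and stakeholders weighed in?",
    "monitor_and_revise": "Is the system monitored, and revised to mitigate harms?",
}

# Simplification: the real checklist pairs stage-specific questions, actions,
# and documentation items; here every stage simply revisits the core themes,
# which is the structural point made in the text above.
checklist = {stage: list(THEMES) for stage in STAGES}

for stage, theme_keys in checklist.items():
    print(f"{stage}: revisit {len(theme_keys)} core themes")
```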

8.1.5 Sparking Discussion About Data Ethics Issues

Interviewees explained that regular reflection on and discussion of data ethics issues can help to build a culture in which people throughout the organization are more likely to spot and raise such issues. The idea is that developers and others need to become sensitized to these issues in order to be able to identify them, and that group discussion is an effective way to build this awareness.

Companies go about building this sensitivity and data ethics culture in various ways. One data ethics manager described their practice of circulating articles and other reports about data ethics incidents, concepts and solutions. “I'm really big on any article I get on data ethics, distributing it broadly,... These are what typically would be a garden variety way of communicating with people. But we're customizing it for data ethics. That's part of my ask from our leadership when they said, ‘How are we operationalizing this?’ Communications is one of my performance goals, actually, so I'm working on it.” (Interviewee #16).

The companies that we spoke with had not yet fully developed their techniques for initiating data ethics discussions in their organizations. However, the Omidyar Network has released a toolkit for sparking such data ethics discussions: the Ethical Explorer Pack.Footnote 3 This toolkit goes well beyond the ethical issues that business use of advanced analytics and AI can raise (the focus of this book) and considers a much wider range of data ethics risk areas. But organizations could adapt its approach to the ethical risks that their use of advanced analytics produces.

8.1.6 Peer-to-Peer Conversations

In a sign of just how important companies find ethical issue spotting to be, interviewees reported the emergence of informal, peer-to-peer conversations about ethical risks and how to address them. One interviewee who works in the Bay Area described off-the-record meetings of twenty or so privacy professionals to discuss the risks associated with advanced analytics and how best to deal with them. “The whole point is to really have a very genuine conversation about the topic, and a lot of people have started to convene them.... there's a lot of interest and activity around wanting to have these really genuine conversations.” (Interviewee #2).

8.2 Processes for Deciding Data Ethics Issues

Once a company has spotted an ethical issue, the next step is to make a sound decision about it.

8.2.1 Just-in-Time Data Ethics

Where senior privacy or ethics executives go out into the business units to spot issues, they can often decide even difficult issues right away. This is the fastest approach. However, it quickly runs into resource constraints. “[T]he challenge is obviously that’s not a process that scales very well. The bigger we get, the more difficult it is to have that in any consistent and meaningful way. So, it’s an incredible challenge.” (Interviewee #10).

8.2.2 Triage and Escalation

The majority of companies that we spoke with employed the hub-and-spokes approach to issue spotting in which a junior person, located in the business unit, identifies issues and refers the hard ones back to the center. Such companies empowered the junior person to make decisions about relatively straightforward ethics issues that arose in their unit, perhaps after a quick consultation with the legal department. However, they required the person to escalate more complex, grey area issues to the more senior and seasoned decision-makers at the center.Footnote 4 One ethics lead analogized this triage and escalation approach to Institutional Review Boards that declare projects that raise few ethical issues to be “exempt” after only cursory review, and that reserve full IRB review for the more ethically complicated proposals (Interviewee #14). A leading consultant described it as “a basic risk assessment process that has escalateable decision points relative to the commensurate level of risk.” (Interviewee #7). A third interviewee used a medical analogy:

If you think about the concept of assessments, it's like a triaged process in an emergency room of the hospital. Somebody comes in, they have cuts and scrapes, I can deal with the cuts and scrapes, I do not have to escalate to a doctor. I don't need to escalate that to the operating room. You have other people come in and they have broken bones that have to be set by a doctor, so you move to a second level of assessment to determine what is the right treatment level. You have a third level, a fourth level, then you have a level where the issues require assessment by a full range of people who have multiple skills, who will then decide whether what's being done is legal, fair and just (Interviewee #1).
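The triage logic these interviewees describe can be summarized in a few lines. The sketch below is a hypothetical rendering of risk-tiered escalation: the tier names, the two scoring inputs, and the reviewer mapping are our assumptions for illustration, not any company's actual process.

```python
from enum import IntEnum

class RiskLevel(IntEnum):
    # Hypothetical tiers echoing the emergency-room analogy: cuts and
    # scrapes stay local; the hardest cases reach the full committee.
    ROUTINE = 1    # business-unit privacy manager decides
    ELEVATED = 2   # legal or subject-matter-expert review
    COMPLEX = 3    # senior central decision-maker
    NOVEL = 4      # cross-functional data ethics committee

# Illustrative routing table; a real program would tie tiers to concrete criteria.
REVIEWERS = {
    RiskLevel.ROUTINE: "spoke: privacy manager in the business unit",
    RiskLevel.ELEVATED: "hub: legal counsel or subject-matter expert",
    RiskLevel.COMPLEX: "hub: senior privacy or ethics executive",
    RiskLevel.NOVEL: "hub: cross-functional data ethics committee",
}

def route_issue(novelty: int, potential_harm: int) -> str:
    """Escalate to the tier commensurate with the higher of two 1-4 scores."""
    return REVIEWERS[RiskLevel(max(novelty, potential_harm))]

# Example: a familiar practice with significant potential harm skips
# the spoke and goes to a senior central decision-maker.
print(route_issue(novelty=1, potential_harm=3))
```

On this design, routine matters stay with the “spoke” and only the most novel or harmful cases reach the committee, which is the pattern the next section takes up.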

8.2.3 The Role of the Data Ethics Committee

Once a complex ethical issue gets escalated, who decides it? Here, again, we see a distinction. Some companies, particularly Silicon Valley firms that emphasize speed and innovation, authorized a senior privacy or ethics official to make these calls. In at least one such company, this official was able to engage the CEO directly when necessary to reach a resolution (Interviewee #10). This yielded a quick, streamlined process in which the senior data ethics officer, backed by the CEO, was empowered to make decisions on behalf of the company.

Most of the companies we spoke with, however, placed the cross-functional data ethics committee described in Chap. 7, rather than an individual, at the center of the decision-making process. Where privacy or ethics managers in the business units confronted difficult or novel issues that they could not comfortably resolve themselves, they referred the issue to such a committee.

Many of the more sophisticated organizations . . . have started to set up these ethics review boards within the organization. So, it's not just about compliance. It's about thinking through these broader sets of data uses and thinking about whether or not they are meeting the company's standards for appropriate data use, if you will. Those tend to be more focused on areas of the business that are more likely to need them, so analytics groups . . . [or] research groups within organizations (Interviewee #22).

In one illustrative example, a data ethics committee considered whether the company should sell information technology to a customer who might, in turn, share it with the Chinese government for use in surveilling its population (Interviewee #2).Footnote 5

The data ethics committee generally operated by consensus, with all members required to agree before an ethically challenging project could move forward (Interviewee #17). The group might tweak the project until all members were comfortable with it.Footnote 6 Some data ethics committees had the power to cancel projects or contracts where the committee believed that the risks were too high. Others were empowered only to make recommendations to the business units. In one important example of the former approach, publicly reported in the media, Microsoft’s AI and Ethics in Engineering and Research (AETHER) Committee vetoed major sales contracts on ethical grounds and put significant limits on others.Footnote 7

Companies that wish to create a data ethics committee would do well to consider some important design choices thoughtfully (Sandler and Basle 2019). These include the following (a sketch after the list shows one way to record such choices explicitly):

  • What types of expertise does this particular company need on its data ethics committee? Which perspectives are most important?

  • Should the committee be able to consult with and get input from an external advisory group?

  • Where should the committee be located within the organization? Privacy? Legal? Risk management? Strategy?

  • To whom should it report? This person needs to be sufficiently high in the corporate hierarchy for the committee’s judgments to carry weight.

  • What standards should the committee use in making its decisions?

  • Should the committee have the power to cancel projects or contracts, or only to make recommendations?

  • Should the committee operate by consensus, or majority vote?

  • How should issues be elevated to the committee? What process should be followed? What materials should be provided for committee consideration?

  • How does the company define success for this committee? More ethical products? Fewer “incidents” that damage reputation?
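One way to work through this list is to treat the answers as an explicit charter. The hedged sketch below records the choices as configuration; every field name and default value is an illustrative assumption rather than a recommendation drawn from our interviewees.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EthicsCommitteeCharter:
    """Hypothetical charter recording the design choices listed above."""
    expertise: List[str] = field(
        default_factory=lambda: ["privacy", "legal", "engineering", "ethics"])
    may_consult_external_advisors: bool = False
    organizational_home: str = "risk management"   # or privacy, legal, strategy
    reports_to: str = "chief privacy officer"      # high enough to carry weight
    decision_standard: str = "company data ethics principles"
    can_veto_projects: bool = False                # False = recommendations only
    decision_rule: str = "consensus"               # or "majority"
    intake: str = "escalation by business-unit privacy managers"
    success_metrics: List[str] = field(
        default_factory=lambda: ["fewer reputation-damaging incidents"])

# A company that wants decisive outcomes might choose majority voting and
# veto power; a consensus-driven culture would keep the defaults.
charter = EthicsCommitteeCharter(decision_rule="majority", can_veto_projects=True)
print(charter.decision_rule, charter.can_veto_projects)
```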

8.3 Broader Themes

Several broader themes emerge from the interviewees’ statements about management practices.

8.3.1 Streamlined Versus Deliberative

To begin with, one can see two basic corporate data ethics management approaches. The first is quick and streamlined. It sends decision-makers out into the business units where they spot issues and make just-in-time data ethics decisions. Where these executives do escalate thorny ethical problems back to the center, the issues go directly to a senior decision-maker who has a direct line to the C-Suite or CEO and is able to reach quick decisions about even the most complex issues.

The majority approach, however, is more deliberative and structured. It involves a hub-and-spokes approach to issue spotting; triage and escalation with respect to issue resolution; and a cross-functional data ethics committee to consider and reach decisions about the most difficult ethical issues, sometimes with input from an external advisory board. We loosely characterize these as “streamlined” and “deliberative” approaches to data ethics issue spotting and resolution.

Based on the interview data, we hypothesize that faster-paced, Silicon Valley-type companies tend to utilize the streamlined process. This gives them speed. However, it both increases the risk that the company may fail to spot certain ethical issues and arguably decreases the thoroughness, and so the quality, of the company’s ethical decision-making. By contrast, more established companies appear to prioritize decision-making quality over speed. They insist that privacy or ethics officers in the field escalate difficult ethical decisions to the more senior executives at the center. They build a cross-functional data ethics committee to deliberate on and decide these complex issues. This takes longer. But it ensures that each decision is the product of a sustained, multi-perspective debate which can, in the most difficult cases, include referral to and input from an external advisory board. This should yield higher quality decisions.

Based on our rather limited interview sample, we further hypothesize that companies in the most highly regulated industries (e.g., health care, pharmaceuticals, finance, transportation) are more likely to have deliberative ethics decision-making systems, whereas those in newer, technology-oriented industries disproportionately adopt the streamlined approach. This may be because highly regulated companies are able to take legacy management structures developed for existing regulatory requirements and adapt them for the beyond-compliance data ethics function.

Finally, we anticipate that the deliberative approaches will narrow the speed gap with the streamlined ones. Companies are likely to take the precedents that their cross-functional ethics committees produce and turn them into guidance for “spoke” decision-makers operating in the business units. This will, over time, enable the dispersed decision-makers to make more decisions, while referring fewer issues back to the committee itself. The speed differential between streamlined and deliberative decision-making processes should thus shrink over time, while the quality difference will remain. This should lead companies, even those in fast-paced industries, to prefer the more deliberative approach over the streamlined one.

8.3.2 Internal Versus System-Wide Focus

The interview data also suggests another important divide in corporate data ethics management processes. Some companies focused their data ethics efforts on their internal operations. Others looked not only at what the company itself was doing, but also at the behavior of its suppliers and customers. They sought to achieve data ethics throughout the entire production system and value chain of which they were part.

Data ethics started with an internal focus. Soon after corporations began widely to use “big data” and advanced analytics, academics and privacy managers analogized this corporate activity to human subjects research in the university context. In an influential 2013 article, Professor Ryan Calo argued that companies should establish Consumer Subject Review Boards that would serve the same vetting function as Institutional Review Boards do in the university context (Calo 2013). This article helped to frame corporate data ethics management as a kind of private sector IRB focused on the company’s own “human subjects” research.Footnote 8 One interviewee recounted that, as they started to build their company’s data ethics process, “I really thought about the IRB model.” (Interviewee #14).

Several interviewees expressed concerns about using the IRB model for data ethics. For one thing, IRBs in the university setting are notoriously slow. “It's not a fast and flexible system, and in the world of data driven applications, a month can be a killer for a project.” (Interviewee #1). Second, an IRB faces inward. It focuses on and considers research projects that bubble up from the company itself. That is a vital function. But, according to some interviewees, it is not sufficient. In today’s connected world, an organization’s misbehavior profoundly impacts its business partners. “You could have everybody doing the right thing, and you introduce one party into that process, whether it's the supplier of certain data or a processor that does a piece of the whole, and the weakness in that link is what's going to bring the whole thing down. The reputational impact... forget the compliance impact or the business continuity impact or investment impact.” (Interviewee #21).

This same interviewee explained that, to account for important risks and protect its own reputation, a company’s data ethics initiative must extend beyond its own ranks to include all entities in its value chain. It must seek to “ensure that each link in a chain or each part of the solution that's provided, that either contributes to or benefits from the predictive analytics, has to subject themselves to a certain competency and a certain set of diligence and a certain moral or ethical commitment to be part of that chain or ecosystem.” (Interviewee #21). This suggests a distinction between those companies that focus their data ethics initiatives internally, through an IRB model or otherwise, and those that take a system-wide approach that includes their suppliers, business partners and, in some cases, even customers.

Thus far, this narrative has focused on the management standards, structures and processes that organizations use to spot and decide difficult data ethics issues. But there is another side to data ethics management that focuses more on technical solutions to bias, opacity and other risks that innovative uses of data, advanced analytics and AI can generate. The next chapter conveys what we learned about technical solutions to data ethics challenges.