Data donations can provide great benefits, express and foster solidarity, and enable individuals to participate in scientific research. But they also raise some difficulties and puzzles.
4.1 Trust
One aspect that well-established practices of giving like financial, organ, or blood donations share with data donations is their reliance on trust. They function only if the donor can expect that her willingness to give will not be exploited by collectors and facilitators, that her donation is being handled responsibly and put to work effectively, and that no third-party interests interfere with the equitable distribution of her donation. The donor also expects that her contributions are being made against the backdrop of appropriate safeguards that protect her from harm, and that burdens arising from the donation process are minimized. Important questions arise about which institutional designs best ensure that such expectations are met, that trust does not erode, and that the practice remains stable. Data donations presuppose trust in similar ways. One case in point is the backlash against the NHS care.data scheme in the United Kingdom, which was intended to enable the sharing of personal health data for research, but was met with distrust due to shortcomings in communication and transparency (Sterckx et al. 2016).
4.2 Future Use
The scope and timing of financial, organ, or blood donations are clearly defined. Donations of biological specimens can be sought with a reasonably well-defined purpose in mind, but already here questions loom about admissible future uses of such samples beyond the initially intended purpose. For example, after plenty of samples were collected to speed up research and development efforts during the 2014 outbreak of the Ebola virus disease in West Africa, questions arose about how to use these biobanks responsibly in a way that provides long-term benefits for the health systems and scientific infrastructures of affected countries (Hayden 2015; World Health Organization 2015). Unlike organ and blood donations, biological specimens are not exhausted once they reach a beneficiary. They can be analysed repeatedly in a variety of study designs. To harness these potentials, regulators and researchers need to think carefully about consent mechanisms, the provision of appropriate information to sample donors, and mechanisms to govern access to the biobanks in which samples are stored.
One distinctive feature of data donations is that the possibility of future uses familiar from biobank donations is driven to the extreme. Consider the de- and recontextualization processes which datasets tend to undergo in the age of big data. Donated health data is likely to be processed and analysed by means of algorithms and applications that are designed to discover and examine unforeseen correlation hypotheses (cf. Mittelstadt and Floridi 2016, p. 312). From a normative perspective, this raises at least three issues.
First, the protective value of anonymization is limited. Some data, such as genomic information, is essentially personalized and cannot be anonymized. But even for other kinds of information, the possibility of de- and recontextualization entails that deanonymization cannot be ruled out. Giving data might be relatively convenient and effortless, but depending on the kind of data and context, linkages with other data can have quite significant consequences. Surprising inferences can be drawn from personal information, especially once it is combined with and set in relation to other data sets. The problem is that individuals are less and less in a position to foresee and take into account potential harms and/or disadvantages that can accrue at the individual or the collective level.
Second, because future uses and possible inferences about the data subject are to some extent unclear at the point of data collection, it is challenging to design consent mechanisms that inform individuals appropriately. The problem is not just that non-experts lack the competence to foresee the possibilities of recontextualization and linkage with other data sets, and that this leads to deep asymmetries of information between data donor and users who have the expertise and technology to process it. At the point of donation, the range of possible recontextualizations, linkages, and inferences can remain inaccessible even to experts. In other words, the exact quality and character of the donation is in constant flux. The question arises how under these conditions, an individual can meaningfully deliberate upon whether or not to donate her health data. There is a tension between the very idea of making such a donation, and the fact that it must remain somewhat opaque to both donors and collectors what exactly is being donated.
Third, the availability of greater sets of data by itself does not guarantee improvements in the quality of data and/or the inferences drawn from it. The complexity of big data sets and the tools used to analyse them poses a range of epistemic challenges for data collectors and researchers that complicate the evaluation of big-data-driven hypotheses (cf. Mittelstadt and Floridi 2016, p. 327). The beneficent potential of data donations is directly tied to the scientific soundness of their analysis, processing, and conversion into research and development. Providing her data entitles the donor to reasonable expectations towards the scientific institutions that she authorizes to use and leverage her donation, for example the expectation that her data is being used responsibly and effectively in a way that reflects her philanthropic intentions. These expectations will be frustrated if scientific virtues like rigour, care, and modesty are not enacted consistently throughout data collection, analysis, and interpretation.
4.3 Invasiveness
The implications about asymmetries of information become even more significant once we consider how invasive data can be in the age of big data, genomics, and continuous and holistic tracking. When we speak of data that can be donated, we are referring to a vast number of biological markers such as an individual’s complete and unique set of genetic information, physical parameters such as location and movements, lifestyle data, and even data about emotions, moods, and states of mind. Moreover, linkages amongst datasets lead to cumulative effects (Braun and Dabrock 2016a, pp. 316–7). First, the combination of clinical records with data from medical research, self-tracking technologies like fitness apps, lifestyle data, financial data, etc. results in levels of invasiveness which individual datasets do not achieve. Second, distinctions between seemingly discrete data kinds and spheres begin to vanish. The fact that companies like Apple, Google, and Microsoft are already active in all these domains underlines that linkages between them are only a matter of time.
The penetrative character of data and devices means that what they extract from us transcends concepts like parthood or possession. The German philosopher Helmuth Plessner (1980) has drawn a distinction between physical body (Körper) and living body (Leib). According to Plessner, one distinctive feature of human life is eccentric positionality, i.e. a particular mode of relating to its own positionality in space: humans can conceive of themselves as both physical bodies existing in the corporeal, outer world of things and as experiencing selves occupying the centre of a spatially delineated physical body, the locus of perceptions, actions, and experiences (cf. also de Mul 2014). Qua physical body, humans live, but qua living bodies, humans are subjects of experienced life. This double aspect is reflected by the two simultaneously instantiated modes of being a living body (Leibsein) and having a physical body (Körperhaben). In view of these concepts and distinctions introduced by Plessner, we might wonder whether, once individuals and their experiences are seen as complex conglomerates of algorithmic processes (for example Harari 2016 chs. 2, 10, 11), captured in their entirety by holistic, rich datasets and invasive devices, the difference between what we are and the features we have has collapsed. In this case, some kinds of data donations—the ones paradigmatically enabled by novel big data technologies—would involve much more than donating merely a part of me, or merely something about me. The question arises what about me is not being captured by data. As long as it remains unanswered, we are left with a sense in which the data donor can give all of her, all she is. The scope of the potential donation is unprecedented.
4.4 Ownership
In order to donate something, it must be mine. I cannot donate things that belong to you, such as your blood or organs. My personal health data is certainly about me, but is it also mine? Much seems to depend on the sense of ownership in question. For example, it is contentious whether personal health data can be seen as private property. Montgomery offers several reasons to reject the suggestion. He notes that in the context of health data, intuitions about privacy “sit uneasily with property ideas”: even if we commodify personal health data, “information ‘about me’ does not cease to be connected to my privacy when I give (or sell) it to others” (Montgomery 2017, p. 82). This suggests that ownership in the sense of private property is not primarily what motivates the regulation of health data.
Moreover, according to a broadly Lockean account, private property results from mixing labour with resources. This idea undercuts rather than supports the view that my health data is mine. While I might have “invest[ed] bodily samples” (Montgomery 2017, p. 83), it is the medical service provider who analyses specimens and data, compiles them into evidence bases, and generates value based on the raw materials I am providing. If labour is any indication, then “[i]f anyone may claim proprietary rights over the information on the labour theory of property, it would seem to be the health professionals or service for which they work” (Montgomery 2017, p. 84).
Montgomery suggests that if we really want to regard data like genomic information as property, it should not be considered private. One alternative is to regard such data as common property, i.e. property shared by a group of people (such as a family), with outsiders being excluded. But Montgomery himself prefers the paradigm of public property: genomic data is like the air we breathe in the sense that everybody is entitled to it, the resource is not exhausted by universal access, and the benefits connected to its usage motivate obligations of stewardship and preservation.
We might have to complement such an account with the additional thesis that privacy- rather than property-related claims could still exclude access to personal health data, especially given the degree of invasiveness and comprehensiveness described above. What matters for our purposes is that data donations are disanalogous to other ways of giving in that they do not involve a transfer of something the donor owns in a straightforward way (on this issue, see also Barbara Prainsack’s contribution to this volume). In fact, as Montgomery also notes, data donations need not even involve a transfer: the data donor need not lose anything. Instead, her donation might be best understood as a suspension of certain privacy claims.
Considerations about ownership become highly relevant once calls for data donations are addressed not only at individuals, but also at data-processing organizations and institutions. In this context, data philanthropy refers to the provision of data from private sector silos for the public benefit, e.g., development aid, disaster relief efforts, and public health surveillance. Social media data can be key in the detection and monitoring of disease outbreaks. Organizations could share data of this kind not only on the basis of corporate social responsibility, but because they recognize the need for a “real-time data commons” (Kirkpatrick 2013). One necessary condition is that the privacy of individuals can be protected through measures like anonymization and aggregation. Even in cases where this is not possible, the hope is that “more sensitive data […] is nevertheless analysed by companies behind their firewalls for specific smoke signals” (Kirkpatrick 2011). Since such data is generated by the private entity, typically on the basis of some form of consent, there is a sense in which this entity is the owner. However, the owner and envisioned data philanthropist is not the data subject. It must be ensured that the interests of the latter are not compromised when data is being made available.
4.5 Affected People
In organ or blood donations, the identity of the beneficiary is often somewhat unclear: unless I am donating to a relative or friend, the recipient will be some indeterminate or unfamiliar other who is in need of the materials I am providing. Still, I have at least a vague idea about certain features and needs of the recipient, e.g., that she is in need of an organ. Something similar applies if I disclose personal health data for the benefit of people who share my illness or risk profile, e.g., on PatientsLikeMe. But note that once data is either decontextualized as described above or not being donated with such a specific purpose in mind, e.g., when uploading one’s genome to openSNP, the potential beneficiary and the way in which she benefits from the contribution become increasingly abstract.
Not only does the range of beneficiaries of the data donation broaden—it is also less clear who is carrying the burdens and consequences connected with the act of sharing. The donation of my kidney is a sacrifice which I make myself. Setting aside the beneficiary, the effects of my donation on others are minimal. In particular, any burdens related to the donation are carried almost exclusively by myself. In contrast, consider how submitting my genome to a public database could reveal information not only about myself, but also about my children or relatives, e.g., on hereditary risk factors. The range of people being affected as well as the precise consequences of the donation are much less transparent to the donor than in other health-related donations.
4.6 Voluntariness
Donations are conscious, deliberate, uncoerced acts of giving, informed by beliefs about a need that is being addressed through the donation. Data donations can be made by means of explicit provision of information to research projects and platforms, or by accepting the terms and conditions of platforms that gather, evaluate, and maybe even publish data of their users (Kostkova et al. 2016). In any case, the informed will of the donor cannot be bypassed. In this context, at least two challenges arise.
First, there is a risk of opacity or even deception about the purpose of data gathering, especially if the sharing of data offers significant benefits to private sector service providers. The question arises how societies and individual donors choose to evaluate the activities of commercial entities that convert philanthropic data donations into products that might improve lives to some extent, but primarily generate non-altruistic, self-serving revenues. For example, the biotechnology company 23andMe (2018) motivates customers to become “part of something bigger” and make contributions that “help drive scientific discoveries” by allowing the company to use data from its direct-to-consumer genetic testing services for research purposes. At the same time, 23andMe is generating intellectual property from its biobank, such as the patent on a gene sequence which it found to contribute to the risk of developing Alzheimer’s disease (Hayden 2012), and a method for gamete donor selection that allows prospective parents to select for desired traits in their future child (Sterckx et al. 2013).
Calls for data donations may allude to philanthropy, altruism, solidarity, and the good a donation can do, but in fact they might at least partly be driven by the self-interest of the data collector. The question of whether to share data in view of private sector benefits becomes particularly pressing in contexts where the latter conflict with the donor’s beneficent aims. For example, consider a situation in which data provision that is intended as philanthropic advances medical research while enhancing and stratifying insurers’ knowledge about risk profiles of donors and customers. Such prospects can ultimately deter individuals from sharing. If they do not, they provide opportunities for private sector entities to freeride upon philanthropic dispositions.
Second, the informed will of the potential donor can be challenged by apparent moral pressures. Understood charitably, headlines like “Our Health Data Can Save Lives, But We Have to Be Willing to Share” (Gent 2017) can be seen as raising awareness for so far unrecognized, readily available, and effort-efficient means for the individual to improve the lives of others. But there is a somewhat questionable flipside to such statements. They might be taken to suggest that an individual acts wrongly if she ultimately prioritizes her privacy over the presumed benefits of a data donation, and/or if she judges the privacy risks to be disproportionate relative to the utility that would be generated by her donation. In other words, a perceived duty to participate might result (Bialobrzeski et al. 2012). In view of rhetoric that declares data a common good and public asset, Ajana sees a risk of pitting data philanthropists against privacy advocates when
“in the name of altruism and public good, individuals and organisations are subtly being encouraged to prioritise sharing and contributing over maintaining privacy. […] First, it reinforces […] the misleading assumption that individuals wishing to keep their data private are either selfish and desire privacy because they are not interested in helping others, or bad and desire privacy to hide negative acts and information. Second, this binary thinking also underlies the misconception that privacy is a purely individual right and does not extend to society at large” (Ajana 2018, pp. 133–4).
A parallel can be drawn to worries regarding self-imposed surveillance and disciplining mechanisms (Foucault 1977) through self-tracking devices (Sharon 2017, pp. 98–99). Voluntary tracking and provision of personal health data can turn into liberty-constraining expectations that data is not only shared, but also that individuals take measures to improve their health markers (Braun and Dabrock 2016a, p. 323). The prospect of doing good with one’s data can similarly be turned into a disciplining narrative that conveys implicit expectations that data should not be withheld. What initially appears to open up options for the individual ends up delimiting them.
These dynamics would be unfortunate from a normative perspective. Data donations might be beneficial and morally commendable, and these features provide some reason to donate. But they hardly provide an all-things-considered reason, let alone a strict duty, to do so. Consider two examples. First, for the Kantian, the duty to help others is an imperfect one, i.e. it remains entirely up to the agent to what extent she helps others (Kant 1785, p. 423). Second, consider effective altruism, according to which there are strong moral reasons to give, e.g., donating money to charity, organs to patients in need, or time and labour to good causes (Singer 2009, 2015; MacAskill 2015), but also to ensure that the good your efforts bring about is being maximized. To our knowledge, effective altruists have not yet explored data donations, but they could be intrigued by the benefits that can be realized through such acts of giving. Still, effective altruists agree that although once you donate, you should donate as effectively as possible, there can be optionality about whether to donate at all. Strong normative reasons to give money to charity can be outweighed by the costs such donations impose on the donor. In such cases, “it would not be wrong of you to do nothing” (Pummer 2016, p. 81). According to these positions, it is far from unreasonable or immoral if an individual decides to be restrictive about her data. There is a fine line between holding her contributions in esteem and implicitly sanctioning or generating a burden of proof for the individual who decides to keep her information restricted.
To sum up, donating personal health data offers alluring opportunities (3.), but a number of challenges lurk along the way. Genuine donors typically have some idea about what they are donating, what the donation will be used for, whom it benefits, and who carries burdens related to the donation. However, in big data contexts, potential data donors are bound to have a limited grip on the nature of their donation, the future use of their data, and the people affected by their decision to share. Further disanalogies come from the invasive and comprehensive character of state-of-the-art data gathering and processing, and the fact that the relevant sense of ownership is far from straightforward. Finally, the voluntariness of data donations can be undercut by opaque or deceptive information and/or moral pressures that appear to deflate individual privacy claims.
Earlier, we suggested that donations can advance positive data sovereignty as they foster social bonds and open up room for manoeuvre in social space. Specifically, we suggested that through data donations, individuals can enact beneficence and solidarity, and play an active role in scientific processes. The challenges just characterized aggravate the uncertainties that are inherent to any act of giving. Important aspects of the good being given are in constant flux: what it will be used for, whom it benefits, and who carries burdens. If the donor decides to give nevertheless, she embarks on a venture into the unknown that can become precarious. Not only might the donation be in vain, fail to accord with the donor’s intentions, and remain unsuccessful in advancing positive sovereignty. Even worse, the donation could backfire and end up compromising negative aspects of the donor’s sovereignty that relate to protective claims and rights, for example against untoward interferences from others, disadvantages, discrimination, or exploitation.