1 Introduction: Environmental DNA and ‘Human Genetic Bycatch’

Humans, just like all other organisms, shed DNA everywhere. Environmental DNA is DNA from organisms (i.e. genetic material originating from, for example, hair, skin, faeces, micro-organisms, pollen, or spores) that is present in an environmental sample, such as air or water (Beng & Corlett, 2020; Bohmann & Lynggaard, 2023). DNA can last in the environment for short or very long periods, even thousands of years in permafrost (Beng & Corlett, 2020).

In May 2023, it was reported that human genomic information can be captured relatively easily and inadvertently from environmental samples, such as those collected for wildlife-focused studies (Whitmore et al., 2023). Human DNA recovered ‘from thin air’, as the New York Times phrased it (Brown, 2023), could be used to identify and phenotype human individuals using so-called shotgun sequencing (Whitmore et al., 2023). This finding was reported by a multi-national research team that used environmental DNA to study green sea turtles (Whitmore et al., 2023). Young sea turtles leave DNA when they crawl from the beach to the ocean (Whilde & Farrell, 2023). However, the researchers also found traces of human DNA in many samples: ‘human genetic bycatch’. In some wild water samples, the levels of human-aligning reads that were detected were almost as high as those of sea turtles (Whitmore et al., 2023). The authors state that identifiable human DNA can be intentionally captured from water, sand, and air.

Interest in environmental DNA has rapidly increased over the past few years because of the COVID-19 pandemic: extensive research has been conducted on the detection of human pathogens from wastewater (Gable et al., 2020; Keshaviah et al., 2023). Within wildlife studies, too, environmental DNA is increasingly used because of its non-invasiveness for animals, its cost-effectiveness, and its relative simplicity (Beng & Corlett, 2020; Whitmore et al., 2023). Environmental DNA has been obtained from all kinds of samples, such as air, soil, sand, freshwater, seawater, snow, and permafrost (Bohmann & Lynggaard, 2023; Whitmore et al., 2023).

Genetic sequencing technologies have improved massively in recent years. As these methods develop rapidly, human environmental DNA is also increasingly found. Whitmore et al. found human environmental DNA in many of their samples, even in air samples collected from a sterile veterinary hospital environment (Whitmore et al., 2023). From human DNA that is sequenced from environmental samples, one can potentially extract all kinds of information. For example, Whitmore et al. were able to identify the genetic ancestry within pooled human populations as well as genetic variants that are associated with disease risks (Whitmore et al., 2023).

The range of possible current and future uses of human environmental DNA is vast: it could be used, for example, for non-invasive monitoring of pathogens, criminal investigations, continual health monitoring of chronic diseases, identifying the sources of waterway pollution, or locating archaeological sites (Whitmore et al., 2023). Alongside these possibly beneficial applications, there is also, as Whitmore et al. point out, a long list of potentially problematic uses and impacts of human environmental DNA, such as the lack of human subject consent, the possible requirement of human-study ethical approvals for wildlife studies, harvesting human genomic data from populations without their consent, or genetic surveillance such as locating ethnic groups (Whitmore et al., 2023).

Related to the broad range of possible applications, this human genetic ‘bycatch’ raises ethical concerns (Doi & Kelly, 2023; Ram, 2023; Whitmore et al., 2023). In the following, I will discuss some key ethical issues that environmental DNA brings about. Moreover, I will show that several relevant analogies can be made between genetic data and digital data, and that these could be helpful to think about the ethical, legal, and policy issues that environmental DNA raises. I argue that insights from digital data protection, in particular alternatives to individual control over data, could be a valuable starting point for bioethical debate on human environmental DNA.

2 Ethical Issues of Human Environmental DNA

The ethical and governance issues of human environmental DNA will often be strongly connected to the different purposes for which environmental DNA can be used. In the following, I will discuss some of the main ethical issues environmental DNA raises, including consent, privacy, commodification, and genetic surveillance. Specifically, it becomes clear that many of the ethical issues of environmental DNA seem not to lie on the individual level, but on the level of broader implications for groups and society. Therefore, I will subsequently discuss some specific approaches from digital data ethics that look beyond individual data control, including predictive privacy and group privacy, and explore to what extent they may be helpful for the debate on environmental DNA.

First, environmental DNA raises complex issues regarding consent. The governance of genetic data is often based on informed consent: an individual has to give some form of consent before their genetic information can be used for research or other purposes. However, when it comes to environmental DNA, consent faces multiple difficulties. The first is quite obvious: when human DNA is floating everywhere in the environment, obtaining prior informed consent from people before starting an environmental DNA study is practically impossible, as Whitmore et al. (2023) also state. Because human environmental DNA is present in many places (and can also travel across long distances), it is probably not feasible to obtain the consent of all individuals. Second, even if it were possible in some cases, individuals simply do not have control over whether or not they shed their DNA. In many cases, people will perhaps not even know that they are leaving behind traces of DNA. Environmental DNA also raises the question of in which situations consent would need to be obtained. As discussed above, there are many potential uses of environmental DNA, and each of these situations may call for different consent requirements. For example, in the hypothetical case of police investigating a robbery in a large department store with the help of environmental DNA, would consent need to be obtained from all customers before law enforcement may sequence the human environmental DNA?

However, aside from these aspects, there is a more fundamental issue with consent when it comes to environmental DNA: it is questionable whether consent can serve as a basis for ethical governance of environmental DNA, even if we accept for the sake of the argument that it would be feasible to obtain consent. The most important reason is that information about others (who have not given consent) can be derived from this data. Large parts of our genetic information are shared with many other individuals: individuals share, on average, about half of their genetic material with their siblings, children, and parents. This shared nature of environmental genetic material, as well as its omnipresence, makes the governance of environmental DNA very complex. In other words, human environmental DNA is not only about the individual, but can also affect relatives and broader groups, because we all have parts of our genetic information in common with our (close and very distant) relatives.

Besides consent, another relevant issue of environmental DNA thus revolves around group interests. Data from environmental DNA could potentially be used to predict information about other people, in particular about people with whom the consenting person has a genetic similarity. In other words, environmental DNA raises issues on an individual level (e.g. information about genetic ancestry, disease risks, and identification could be derived from this environmental DNA) but also on levels beyond that individual, including groups. For example, it could, hypothetically, become easier to ‘circumvent’ ethical guidelines on genomic research of Indigenous populations. It has been argued that genomic research regarding Indigenous populations should be done not about Indigenous populations, but rather for, by, and with Indigenous populations (Tsosie et al., 2021). Human environmental DNA could potentially be used as a way of circumventing the ethical guidelines that govern genetic research. In that respect, it might be useful for the debate on environmental DNA to learn from the debate on the ethical issues raised by ancient DNA, such as the implications of genomic research on ancient DNA both for individuals and for communities (Cortez et al., 2021).

There is also a risk that companies will monetize human environmental DNA in some way (Whitmore et al., 2023). Direct-to-consumer genetic testing companies may, again hypothetically, use human environmental DNA to make inferences about certain populations for which they have not yet ‘fine-tuned’ their genetic ancestry testing, for example because not ‘enough’ people from a certain ethnic background have done a genetic test with the company. Direct-to-consumer genetic companies base their genetic ancestry services partly on the data they have collected from their customers and the information these customers share about their characteristics such as ethnicity. Environmental DNA may in the future be used to strengthen predictions and infer information about certain groups. These are all still hypothetical future uses of human environmental DNA, but it is nevertheless important to explore the potential uses in detail in order to evaluate their ethical and legal aspects in a timely and effective manner.

Environmental DNA also raises concerns about genetic surveillance and possible criminal investigative uses (Ram, 2023; Whitmore et al., 2023). These concerns might be especially urgent in the USA, because, as Ram (2023) has pointed out, many US lower courts have held that people do not have constitutional privacy rights when it comes to DNA they have unintentionally left somewhere (for example on a coffee cup). When it comes to possible uses for criminal investigations, a recent study found that human DNA can be retrieved from air in amounts needed to provide an STR genotype, which is the type of forensic DNA profile that law enforcement uses (Fantinato et al., 2022). Environmental DNA could also be used for genetic surveillance, or to make genetic surveillance simpler. Human environmental DNA could have implications for genetic surveillance in at least two ways. First, human environmental DNA could be used to locate people from certain ethnic groups (Whitmore et al., 2023). Second, human environmental DNA could be used for genetic surveillance ‘research’ purposes in the sense that the results can be used for, e.g., phenotype (i.e. physical appearance) identification of certain ethnic minorities. Notably, the possibilities for genetic surveillance will not depend on extremely complex technologies: it was reported that researchers used a very small sequencing device that can be plugged into a laptop and costs only around US$1000 (Brown, 2023).

This brief discussion of some of the main ethical issues of environmental DNA shows that many of these issues extend beyond the individual and concern interests on the level of groups and society. In the next section, I will discuss several significant similarities between data from environmental DNA and online digital data and will explore to what extent non-individual-based approaches in the digital data protection literature may be helpful for the discussion on environmental DNA.

3 Shedding DNA in the Environment, Shedding Data Online: a Parallel

As we have seen, many ethical issues that environmental DNA raises reach beyond the individual. Therefore, although this paper will not offer a complete answer to the question of how to govern environmental DNA effectively, it suggests a possible direction: to look beyond individual data control, and thus beyond the individual’s informed consent. Such an approach regarding environmental DNA seems particularly urgent, as it has been argued that when it comes to genetic information in general “it may be time to shift attention from attempting to control access to genetic information to considering the more challenging question of how these data can be used and under what conditions” (Clayton et al., 2019). In the following, I discuss some relevant analogies between digital data and data from human environmental DNA and explore some specific approaches from digital data ethics that look beyond individual data control, including predictive privacy and group privacy.

Traditional approaches to research and governance of human genetic data are often based on consent: an individual has to give some form of consent before their genetic information can be used for research or other purposes. As we have seen, a focus on informed consent or individual data control, although often relevant within the healthcare context of genetic data processing, has significant limitations when it comes to environmental DNA. Broadly, these limitations amount to (1) practical difficulties in obtaining consent, and (2) the fact that the consent of one individual has implications for others. An analogy can be made with digital big data for both of these aspects. In terms of the first aspect, as we shed online big data traces everywhere we go (just as we shed environmental DNA traces), complete individual control will often simply not be possible. Nissenbaum has critiqued the way in which debates on information technology often quickly turn to the ‘consent model’, which is visible in the predominance of ‘accept cookies’ buttons in the online sphere (Nissenbaum, 2011). Nissenbaum argues that this predominant approach has arisen from the idea that the right to privacy is synonymous with the right to control information flows that concern yourself (Nissenbaum, 2011). She argues that the consent model is unlikely to improve privacy, because it is practically impossible to inform online users about every data particle that is collected, processed, and analysed about them. This is also the case for human environmental DNA: as we leave DNA everywhere, it is simply not feasible to control this on an individual level.

The second aspect of the analogy between environmental DNA data and digital data regarding the limitations of consent concerns the fact that the consent of one individual affects other people. In that respect, it is useful to look at debates in digital ethics. Debates within digital data protection and privacy have long critically questioned whether consent or individual control should remain a central element in digital data protection (Mühlhoff, 2023; Nissenbaum & Patterson, 2016; Sætra, 2020; Solove, 2008; Yeung, 2017). Because of the possibility of inferring or predicting information about many digital consumers who have not consented themselves, Mühlhoff (2023) has argued for protecting ‘predictive privacy’. Predictive privacy results from the insight that voluntarily disclosed data may harm other people and that it is somewhat outdated to have one digital user controlling which information is collected, when this information can predict things about other people (Mühlhoff, 2023). According to Mühlhoff, one’s predictive privacy is violated “when personal information about them is predicted without their knowledge and against their will based on the data of many other people” (Mühlhoff, 2023).

The notion of predictive privacy is very relevant for the debate on environmental DNA. Genetic data derived from environmental DNA could potentially be used to infer information about direct family members, but also about broader genetic groups the individual is a part of. In this context, discussions on group privacy (Puri, 2023; Taylor et al., 2016) within the digital data debate could be helpful for the discussion on environmental DNA. Environmental DNA could potentially divide people into different ‘groups’: these groups may map to or correlate with some recognized social group (such as an ethnic group), but it could also be the case that genetic groups do not map to any socially recognized group. When it comes to genetic groups that are created by environmental DNA processing and that do not correlate with socially recognized groups, the ethical debate on environmental DNA may learn from the discussion on ‘algorithmic groups’ (Mittelstadt, 2017; Wachter, 2022). Algorithmic groups are groups that do not align with characteristics or attributes that are protected by privacy or anti-discrimination legislation, such as ethnicity or sex (Mittelstadt, 2017; Wachter, 2022). For the debate on environmental DNA, we have to explore possibilities beyond individual approaches to privacy and individual data control and examine to what extent group-based privacy rights and collective privacy rights could be relevant to environmental DNA.

Another relevant issue that environmental DNA raises concerns the limits of anonymizing this data. First of all, the distinction between identifiable and ‘de-identified’ human genetic data is becoming increasingly blurred, because ‘de-identified’ genetic data can relatively easily be re-identified (Ram, 2023). In online big data, this distinction between ‘anonymous’ and ‘personal’ data has also been challenged, because anonymous digital data can similarly be re-identified through other data. Moreover, the identifiability of digital data can change over time due to novel technological developments (Wachter & Mittelstadt, 2019). The same is true for data derived from environmental DNA: it is not clear how the quantity and quality of collecting, processing, and analysing data from environmental DNA will develop in the future.

Yet, even if we were to find some way to ‘anonymize’ data from environmental DNA, a more fundamental problem remains. This problem has also been addressed in the online data ethics literature: even with anonymized data, groups of individuals can still be identified and targeted (Floridi, 2014). For that reason, it has been argued within the context of online big data that the distinction between ‘anonymous’ and ‘personal’ data is outdated (Mühlhoff, 2023). Yet, the EU General Data Protection Regulation (GDPR) does not regulate anonymized data (Wachter, 2019).

In terms of governance, too, it is unlikely that isolated national governance will be very effective when it comes to environmental DNA. First, environmental DNA can travel very long distances, even across borders. Second, genetic relations do not stop at countries’ borders. Therefore, we will probably need some kind of collective, transnational governance for environmental DNA (just as the European Commission has recognized the need for collective governance in its recent efforts to regulate Big Tech).

In sum, we leave genetic traces everywhere just as we leave big data traces everywhere. When it comes to human environmental DNA, insights from online digital data protection debates could be helpful. In particular, they can help to articulate the fact that capturing human environmental DNA not only concerns direct harms to the individuals from whom the DNA originated, but can also have implications for broader groups of people (such as in the case of genetic surveillance or genetic research of Indigenous populations). As environmental DNA arguably has some relevant similarities with digital data, conceptualizations from the digital data debate such as group privacy and predictive privacy could be a starting point to address the complex ethical, legal, social, and political aspects that environmental DNA raises.

4 Conclusion

Our DNA floats everywhere; capturing it, therefore, potentially affects us all. I have discussed several ethical issues that human environmental DNA raises. This paper suggests that insights from online digital data protection can be a valuable starting point for ethical, legal, and policy discussion on human environmental DNA. This attempt is by no means exhaustive: it is likely that many other insights from the digital data debate, as well as other fields, could be helpful for the debate on environmental DNA. As we leave genetic traces everywhere (just as we also leave digital traces everywhere online), seeking individual control over all of that information seems not only practically almost impossible, but also ethically problematic. We therefore need new approaches to articulating and addressing the challenges that novel genetic information processing techniques bring about. What is needed is an interdisciplinary approach, one that looks beyond individual data control, that can effectively deal with the complex governance issues that environmental DNA raises.