1 Introduction

“Sharing is caring” is not only the mantra advocated by the fictional company The Circle in the novel and movie of the same name, but also the sentiment that is the foundation for data commons. The basic idea is that sharing certain goods is ethically good, as those goods ought to be outside private ownership and shared by everyone in the common, and that access to and consumption of goods in the common shall be governed and regulated to ensure fair use and prevent exploitation. Mountains, fish, and the military are frequently used as examples of common goods—and the General Public License and Creative Commons licenses are examples of legal frameworks that move creative works, software, music, scientific knowledge, images, etc. from private ownership regimes into the commons. Institutions like public libraries, national park systems, and Wikipedia are founded around notions of sharing, equal access, and being a common good.

Given the concerns about big tech’s hoarding of personal data, creation of profiles, mining of data, and extrapolation of new knowledge from their data collections, there is a need for and interest in devising policies and regulations that better shape big tech’s influence on people and their lives. One proposal to counter big tech’s data power is to create data commons.

The basic idea behind data commons is that data ought to belong to everyone, and that the private data collections constructed by big tech should be shared with everyone as a data common. Just like fish is a common good, data should also be a common good. The only problem, of course, is that data is not like fish—data concerns individual people and their lives.

Within Facebook’s mission, as expressed by Mark Zuckerberg, “sharing is represented as a mechanism for improving human relations and making the world a better place” (John, 2012, p. 177). Although sharing (of ideas, thoughts, stories) is fundamental for human relations, the idea that sharing in itself makes the world a better place—the ideology which data commons rest upon—is an ideology that does not take the human need for privacy, seclusion, and confidentiality into account. The ideology does not consider the role privacy plays in democratic societies—where privacy offers people a space for reflection, enabling the development of self, autonomy, and agency, i.e., to become an active citizen (Lever, 2016; Solove, 2008; Macnish, 2018). Furthermore, it is an ideology that in some sense resembles the idea that technology in and of itself will make the world a better place: the very ideology that Winner (1986) critiqued heavily in the 1980s under the headline Mythinformation, in which he argued for a philosophy and understanding of technology based on the Wittgensteinian notion of “forms of life.”

According to Wittgenstein (1953), understanding of the world is grounded in language. It is through language use that meaning is created, developed, and negotiated. However, this creation is not entirely free in the sense that it is conducted within specific language games—i.e., structures that govern how we can use language and thus construct meaning. All this takes place within a form of life, a practice, which further structures the various language games we have available: “the speaking of language is part of an activity, or a form of life” (Wittgenstein, 1953, §23). Fundamental to the notion of form of life is not only that meaning is created, developed, and negotiated in usage within practices, and that words and sentences can only be understood within such forms of life, but also that agreements about what is true and false, and what is right and wrong, take place within forms of life: “It is what human beings say that is true and false; and they agree in the language they use. That is not agreement in opinions but in form of life.” (Wittgenstein, 1953, §241).

According to Wittgenstein, the use of language and the way we understand and make sense of the world are firmly tied to everyday practices and experiences. Using language is like a game. To use the language, one must follow certain rules, some of which are made explicit while others remain implicit; the rules are part of the social and cultural context of which the language is part. Coeckelbergh (2017) uses Wittgenstein to develop what he calls “technology games”; he suggests that to understand technology, its potentials and consequences, technology must be understood within the context and situation in which it is employed. Like words, technology is a tool to be used in particular situations for specific purposes, and like words, technology gains its meaning and value in context; we learn to use and to trust technology in the specific context: “Technology, like language, is something we use, it is linked to activities, it is done in a particular context, its use and know-how is learned by doing, and it is part of a form of life. It also relies on trust in know-how that is already there in our culture. This includes trust in technology.” (Coeckelbergh, 2017, p. 27).

To understand the advantages and disadvantages, the possibilities and limitations of data commons, we propose an ethics of sharing that critically examines the underlying logics of data sharing. Sharing is not necessarily caring; using Wittgenstein, we will argue that this depends on the context, practice, and the specific usage of the data, as well as who is sharing the data. It is not a neutral, factual statement to claim that data ought to be viewed as a common good and shared; that is a moral statement and should be evaluated within an ethical framework. As such, a statement, a policy, or a conversation about sharing is about moral standpoints: “an ethical conversation is not like ‘I like ice-cream,’ ‘I don’t,’ where the difference doesn’t matter. It is like ‘do this,’ ‘don’t do this,’ where the difference is disagreement and does matter” (Blackburn, 2001, p. 25). In the same way, sharing by itself is neither a good nor a bad thing.

The ethics of sharing which we propose in this paper is thus grounded in Wittgensteinian language games and forms of life, as well as in MacIntyre’s (2007) argument for a practice-based virtue ethics. This Wittgensteinian and MacIntyrean approach is a pragmatic structuring scheme that takes human relations, sociality, and context into account.

The paper is divided into five main sections dealing with commons, sharing, data, ethics, and privacy. Each section forms a smaller part of the overall argument, bringing one relevant concept from a specific literature to the table. Together, these five sections analyze the main concepts, logics, and issues at play, tying together different strands of research in order to offer an ethics of sharing—an ethical framework to critically examine the logics and evaluate the moral statements about data sharing. The ethics of sharing is developed continuously throughout the five sections and then restated in the conclusion, which ties together all the threads.

2 Commons

“Commons” has been suggested as a framework to understand the exchanges of ideas, materials, and rights that go beyond the limitations set within the frameworks of “property” and private property rights. The basic notion driving the idea of commons is that there are some resources in the world to which all humans ought to enjoy access and over which no one ought to have ownership or property rights; the oceans being the prime real-life example of a resource that ought not to be captured in property rights. It is agreed that everyone must have access to the oceans, to the fish in the oceans, and to swimming in the oceans. It would therefore not be useful to think of the right of access to the oceans, of usage of the oceans, or of fishing in the oceans in terms of ownership and private property. No one owns the oceans. However, the oceans need protection. Thus, the oceans are understood as commons, and access to and usages of the oceans are controlled through a set of regulations that have been put in place via democratic and transparent procedures. It is this idea of how to control access to and usage of a common resource that is now being proposed as a means to control access to and usage of immaterial things such as ideas, rights, and data, and this is what is referred to as “data commons.”

To understand the mechanism of commons, specific resources can be categorized along two classificatory principles (Benkler, 2006, p. 61): access and regulation. Access concerns whether the resource is open to anyone or only to a limited or specific group. Everyone has access to open commons, such as “the oceans, the air, and the highway systems” (Benkler, 2006, p. 61), whereas only a specific group of people has access to limited-access common resources such as certain “traditional pasture arrangements in Swiss villages or irrigation regions in Spain” (Benkler, 2006, p. 61). The difference between the two is that limited-access common resources are more like private property to the rest of the world, except for the few people who have access to them. Open commons are equally accessible to everyone.

The other classificatory principle is whether usage of the resource is regulated or unregulated (Benkler, 2006, p. 61). Both social conventions and legal frameworks can regulate the commons and the use of the resource. While the use of air for breathing is unregulated (meaning that all humans can use the air to breathe), there are social and cultural norms for how to inhale and exhale in crowded places. Likewise, while everyone in the community may use the playground, there are both social norms and written guidelines that regulate that use. And while everyone is allowed to go fishing, they need a fishing license first, and there are rules as to how many fish they can take home and when they can do it.

These four basic types of commons (Fig. 1) leave open to discussion and localized consideration how and when a particular resource belongs to one or another type. The categorization of a particular resource in the matrix depends on the degree of freedom to access the resource, and whether and how the usage of the resource is regulated. Software, for instance, can be (i) open and freely accessible to all, (ii) its usage can be regulated via licenses, (iii) it can be made freely available only to certain people, or (iv) available to a specific group under certain regulations (say terms of subscription). Software can of course also be entirely privately owned, and as such be private property.

Fig. 1 Four types of commons in Benkler (2006), figure developed by the authors

Another dimension to the definition of resources that can be considered to identify the commons is the pair of principles excludability/non-excludability (similar to access) and rivalry/non-rivalry. Pure private goods are excludable and rivalrous, whereas pure public goods are “non-excludable (i.e., it is impossible to exclude those who did not pay for consumption from enjoying a public good) and non-rivalrous (one’s consumption does not limit consumption of others)” (Purtova, 2017, p. 182). The classic commons of air, oceans, fish, and greens, etc. lie somewhere in-between and are often defined as non-excludable and rivalrous—they are rivalrous because the goods are in some ways exhaustible and need management and protection. If we catch too many fish today, there will be no fish tomorrow; if we emit too much carbon today, there will be no clean air tomorrow; where I stand on the mountain, you cannot stand; etc. However, we still share these goods and resources because we all have access to them—they are non-excludable.

The goods and resources in the commons are by default sharable; their very nature and definition are such that people share them and that more people can enjoy them simultaneously. When we enter the digital domain, sharing becomes even easier as the rivalry disappears. In the digital domain more people can use the same resources at the same time without sacrifice, yet these resources still need management in terms of curation, and sometimes protection. It may therefore seem obvious to conclude that “data should be understood as a form of commons that requires protection and careful management” (Rahko & Craig, 2021, p. 195), especially if one considers data to be “cumulative” and “non-rivalrous” (Rahko & Craig, 2021, p. 195). In this understanding, data could and should be made accessible as a common pool resource because that would hinder the commodification of data and the “overprotection of copyright and intellectual property” (Rahko & Craig, 2021, p. 197), and because it is unproblematic to share goods that are non-excludable and non-rivalrous.

Central to the idea of “the commons” is the notion of “sharing.” Sharing takes place within all four types of commons in Benkler’s categorization; it is just a matter of who gets to share what (access) under which circumstances (regulation). The central principle guiding the commons is the sharing of goods and resources—material or non-material—in different ways. Sharing is an activity that comes with different logics that establish different practices and possibilities. As such, sharing is not a specific kind of action, but an activity that takes place in different forms in different contexts; the special and unique circumstance of sharing in (data) commons, as we will argue in the next section, is that while sharing is typically an activity in which one shares with another something that one owns, sharing in (data) commons is an activity in which a second party shares with a third party something that belongs to everyone or no one. It is a different mode of sharing.

3 Sharing

In The Social Logics of Sharing (2013), Nicholas A. John describes sharing as a metaphor and a key concept “emerging across a range of fields” (p. 114). The concept has different meanings and connotations in the different fields, but “if we wish to understand it in one sphere, we need to take other spheres into account as well” (John, 2013, p. 114), the reason being that the once fixed meanings weave into one another, creating new logics as new contexts emerge. These logics are also what could be referred to as practices (in the Wittgensteinian and MacIntyrean sense).

According to John (2012, 2013), there are three different meanings, modes, or definitions of sharing. These are not to be equated with the logics of sharing, as the logics are the combination of the different meanings of sharing in specific contexts. Thus, John (2013) analyzes how the different meanings of sharing play into one another in the contexts of (i) Web 2.0, (ii) “sharing economies,” and (iii) intimate personal relationships creating new logics of sharing.

3.1 Kinds of Sharing

The meanings of sharing fall within the two broad categories of sharing as distribution and sharing as communication. Sharing as distribution comes in two different varieties: (i) the classic act of dividing some tangible good into parts; and (ii) a more abstract meaning of having something—concrete or abstract—in common. When we share by dividing—e.g., sharing a chocolate bar by breaking it in half—we are engaging in a zero-sum game governed by cultural norms constitutive of social relations (John, 2013). By sharing my chocolate bar with you, I “lose” the half I distribute to you, and we create a social bond of some kind.

When considering sharing in the second meaning of distribution, as having something in common, sharing concerns concrete or abstract things, not as a zero-sum game, but as something that “remains whole, despite being shared” (John, 2013, p. 115). Or as Wittel (2011) puts it, “the sharing of immaterial things does not ‘reduce’ anything but adds value to whatever is being exchanged” (Wittel, 2011, p. 5). An example of a concrete object that could be shared is a dormitory room, and examples of abstract objects are “interests, fate, beliefs, or culture” (John, 2013, p. 115). In the meaning of sharing as having something in common, sharing is not about creating social ties, as it is possible to share a belief or a culture without having any relation or bond to one another.

Sharing as communication comes in one variety focusing on the sharing of our feelings and emotions, that is, “imparting one’s inner states to others” (John, 2013, p. 115). According to John (2013) this kind of sharing is—like distribution by division—about creating and regulating social ties.

However, instead of distinguishing between sharing as distribution and sharing as communication, it is also possible to place emphasis on what kind of good, object, thing, etc. is being shared and the outcome of sharing the good, object, thing, etc. Thus, Wittel (2011) suggests a focus on the differences between sharing material and immaterial things,

“Whereas the sharing of material things produces the social (as a consequence), the sharing of immaterial things is social in the first place. Whether we share intellectual things such as thoughts, knowledge, information, ideas, and concepts or affective things such as feelings, memories, experiences, taste, and emotions, the practice of sharing is a social interaction. The sharing of immaterial things produces (as a consequence) other things than social relationships, such as knowledge, art, rules, and religion.” (p. 5)

Whether the sharing of emotions, feelings, knowledge, ideas, etc. creates the social or is already social, the link between sharing and the social is tight. As John (2013) concludes,

“Sharing is associated with positive social ties. Sharing is always good, and when it is not, we refer to it as ‘sharing,’ where the quotation marks serve to position us at an ironic distance from the word enclosed in them (…), or we insist on the use of a different word.” (John, 2013, p. 127)

This association between sharing and something positive possibly stems from our current therapeutic culture where “sharing is a type of communication that implies equality, mutuality, and honesty” (John, 2013, p. 124)—i.e., the constitutive activity of intimate relationships. We are talking about a mode of communication “that resonates with an ethic of care, implying attentiveness, and concern” (p. 124). “Sharing is caring,” as the catchphrase goes, thus turns into its own logic and ethics of sharing.

3.2 Sharing Economies

As a completely different logic of sharing we have the “sharing economies” (John, 2013) based on sharing as distribution. These economies are divided into economies of production and economies of consumption. Production is divided into goods shared by all (the commons) and the sharing of time, knowledge, and the like (i.e., something an agent has), whereas consumption is divided between the sharing of personal property and memberships of third-party-owned sharing communities. “The paradigmatic example of a sharing economy of production is provided by Wikipedia” (John, 2013, p. 118) falling into the realm of the commons (Benkler, 2006) or the digital commons (Dulong de Rosnay & Stalder, 2020).

The commons are the result of a shared economy of production and, together with the goods shared by all, they are linked to the meaning of sharing as having something in common with someone (John, 2013), of having equal access with respect to both abstract and concrete objects. These commons, which are mostly digital or digitally supported, are not a zero-sum game; they remain whole when shared and even grow in size when people are sharing the effort of creating them (e.g., Wikipedia). In contrast, the classic commons of mountains, oceans, fish, and greens, etc. are rivalrous (i.e., zero-sum games) when the goods shared are exhaustible like fish. Thus, the sharing in the digital commons (data commons) is different from the sharing in the traditional commons—not necessarily in terms of the production (the efforts to produce can be shared much in the same way) but in terms of the sharing of the good produced or maintained in the commons.

The sharing economy of consumption is also of special interest to the commons. Here the sharing is of personal property (goods produced somewhere else or by the individual) and examples of goods being shared for consumption online are files, codes, photos, videos, and knowledge (John, 2013). This kind of sharing is not a zero-sum game as “The sharing of digital things is effortless, it does not involve any sacrifice. Digital things just get multiplied” (Wittel, 2011, p. 6).

Sharing of data could be considered as belonging to the category of sharing something for consumption of what is “owned,” as data is related to codes, files, and the like. Or, it could be considered as sharing in terms of production as data is also related to information and knowledge. It seems to be a matter of the format and at what stage the sharing occurs (production or consumption). However, in the discussions of data sharing another logic seems to be at play as well, namely, that “sharing is caring.” Under the headline Data: Sharing is Caring, Levenstein and Lyle (2018) argue in favor of increased data sharing in research. In fact, they argue, data sharing should be mandatory, even when it comes to health data and other sensitive or personal data (with anonymization or special protection). The argument is that data sharing is always good and has many benefits including replicability of studies, new studies using the same data, etc., all leading to more and better knowledge (Levenstein & Lyle, 2018).

“Sharing is caring” also guides the practices in online domains such as social networking sites, where sharing of personal information, pictures, stories, links, etc. is the modus operandi. In Chelberg’s (2021) words,

“Sharing metaphors encourage user participation in online sharing practices and reinforce cooperative behavioral norms by association with positive social values. The metaphor ‘sharing is caring’ explicitly expresses this online sharing ethic to press participation in online cultures, and, at the same time, designates not sharing as ‘uncaring’.” (Chelberg, 2021, p. 64)

In the online domain, sharing has become synonymous with participation drawing on the therapeutic discourse defined by John (2013), manifesting the link between sharing and an ethics of care visible in the arguments by Levenstein and Lyle (2018). Chelberg (2021) further highlights the ethical dimension of sharing by stressing that sharing is both an act and the foundation for participation in the online domain,

“With sharing as its ‘core cultural value,’ sharing practices form an online ethical framework where sharing is a threshold activity to participation in digital cultures. In contemporary digital society, sharing is both the act and ethic of online participation.” (Chelberg, 2021, p. 64)

The ethics here are drawn from the ethics of care and then linked to sharing and online participation such that “The ethics of sharing in a digital society mean that ‘good subjects post, update, like, tweet, retweet, and most importantly, share’.” (Chelberg, 2021, p. 64).

It is this logic of sharing as caring, and of sharing as the foundation for participation in the online domain, that tech companies and social networking sites have played an important part in establishing in order to exploit personal data for commercial gain (John, 2012, 2013; Wittel, 2011). Furthermore, it is these practices engaged in by the tech companies that “data commons” are suggested to counter or be an alternative to (cf. Prainsack, 2019; Purtova, 2017): a way of getting back power, autonomy, and authority over data or information, personal, sensitive, or otherwise.

3.3 Sharing in the Commons

The logic behind data commons is also the equation of sharing with caring. If data is pooled together outside the scope of tech companies, data can be used for the good of and by everyone. Moreover, this underpins the logic of a sharing economy of production (where John, 2013, locates the commons)—at least with regard to the type of sharing that takes place in relation to commons in the digital realm, where we care by sharing the effort of producing the common—as well as the logic of a sharing economy of consumption, the sharing of “things one owns,” where we care by sharing our data, using it for purposes that benefit us all.

However, the concept of data commons is defined on the model of traditional physical commons, where the mode of sharing is not necessarily one of production (although it can be, when we are dealing with the maintenance of livestock, fish, crops, and other living beings or ecosystems). Thus, data commons are defined from two different and at times competing angles simultaneously. This ties in with Wittel’s (2011) argument regarding the differences between digital and pre-digital times,

“Whereas sharing in the pre-digital age was meant to produce social exchange, sharing in the digital age is about social exchange on the one hand and about distribution and dissemination on the other hand. What makes sharing with digital media so hard to understand is exactly this blurring of two rather different things.” (Wittel, 2011, p. 8)

This blurring is furthered by the “catch-all status of sharing in digital contexts as referring to transfer of data” (John, 2012, p. 170), obscuring the very different practices taking place in the digital realm, often simultaneously. For instance, “references to the transfer of data about users to advertisers as ‘sharing information’ with third parties serve to mystify relationships that are in fact purely commercial.” (John, 2012, p. 169).

These conceptualizations of sharing establish different meanings (as well as different logics) of the notion “to share.” However, common to these conceptualizations is that people share something they themselves own or have produced, or they share their own feelings, private thoughts, etc. What is shared in the classic understanding of commons, on the other hand, is owned by everybody or nobody.

With data commons and the sharing of data (including personal data) a different mode of sharing is at play. The data is collected from individuals by a second party, and then shared by that second party with third parties. Thus, it is not necessarily individuals themselves—from whom the data is collected—who also share the data. This gives us a completely different mode of sharing which we do not find in any of the abovementioned conceptualizations of sharing (neither in the meanings, nor in the logics of sharing)—this is a mode of sharing where someone shares what in one way or another belongs to somebody else. This mode of sharing is still sharing as distribution—the difference being who is doing the sharing.

If we consider this mode of sharing with the example of sharing a chocolate bar, it would amount to me collecting your chocolate bar and then sharing it with someone else. If I do it without consent, we would normally consider the collection of the chocolate bar to be stealing. However, my sharing of your chocolate bar afterwards could still be considered sharing, although I am the one deciding with whom to share your chocolate bar. Alternatively, if I collect your chocolate bar with consent, it would not be considered stealing and you might have a chance to influence with whom I can share the chocolate bar, but it is still me who is doing the actual sharing, and thereby ultimately deciding with whom to share the chocolate bar.

As discussed above, sharing as distribution of material goods is a zero-sum game; some of the chocolate bar is lost when shared. The sharing of data, on the other hand, is not a zero-sum game. When shared, data is not lost or reduced; you and I can have the same data at the same time, and if I share the data with a third party, you will still have the data. However, if I collect your personal data and then share it with a third party, challenges arise—not in terms of rivalry, but in terms of ethics. When I share your data with a third party, is sharing then caring? Under which circumstances and in which context would it be right to share others’ data with someone else? Additionally, and to complicate matters even further, when sharing personal data we do not merely share data about ourselves; often the shared data is also about others, either directly or indirectly (cf. Søe et al., 2021).

In a datafied world, where most personal data is collected automatically, much data sharing takes place between third parties, and not only between data subjects and second parties. Thus, although it is sharing by distribution of abstract non-rivalrous things, the logic and context of sharing are different in terms of who is doing the sharing.

4 Data

The challenge with data commons is that they reinforce the very problem they are envisioned to address. While data commons are imagined to counter or solve the challenge of the hoarding of (personal) data and to make the benefits and values of the data accessible to more people (cf. Prainsack, 2019; Purtova, 2017), data commons reinforce that challenge by sharing data which in the first place should have remained private.

As such, data and information are fundamentally different from fish, mountains, scientific knowledge, and the other goods and resources originally considered to belong in commons. Data and information derived from online or digital participation is different from knowledge in digital commons, such as Wikipedia. The significant distinction lies in the nature of the object shared in the commons: In Wikipedia, and other digital commons, the object is the content (i.e., the pages written, the music or movies stored, shared and produced) and the knowledge that is shared. In data commons, the object is the patterns of people’s behavior and preferences (as they search, click, like, share, watch, listen), their health profiles based on medical information, bodily monitoring, etc.—in other words, the object is the “behavioral surplus” (Zuboff, 2019, p. 11) that is stored in data commons, and potentially shared. The problem is not only that big tech companies gain financially from their data collections, but also that we cannot in any way foresee what implications these collections of data have for us—what they can tell about us, and how that knowledge can be used. This last part of the problem is what is reinforced by the data commons. Collecting data in a common does not in itself help us to anticipate what can be derived from the data and how it can be used.

Sharing of data could be conceptualized both in terms of sharing as communication and in terms of sharing as distribution, to use John’s (2013) terminology. In the case of data commons, sharing as distribution is the underlying idea; if we think of the sharing of data as sharing of “behavioral surplus” (i.e., the patterns inferred from the data), however, it is more akin to the sharing of emotions, feelings, thoughts, even secrets, and the like—i.e., sharing as communication. This is because these patterns reveal things so deeply personal that we might not even be aware of them ourselves and have no way of anticipating them.

If we consider sharing of data as something akin to the sharing of secrets, it also becomes apparent why the sharing between third parties is different from when we share our own data. If you share my secret with others (even though I have shared it willingly with you), this sharing is very different from when I shared my secret with you—both in a practical and a moral sense. Although a secret is an abstract and immaterial thing, which would make it non-rivalrous and perhaps even non-excludable, secrets are no longer secrets when shared; secrets lose their status as secrets when a certain number of people know the secret.

In this respect, personal data ought to be considered more as secrets (with a sharing mode of communication) than as codes, files, or knowledge (with a sharing mode of distribution). Personal data is therefore not like fish and, as argued by Prainsack (2019), “the assumption that digital data and information are (or can be treated analogous to) a physical common-pool resource and be governed by established principles designed for commons does not stand up to scrutiny.” (p. 7).

However, the logic behind data commons seems to be the following: the sharing of knowledge is good, thus the sharing of research literature and other literature is good, thus the sharing of research data is good, thus the sharing of other kinds of data is good, thus the sharing of data from big tech is good (and certainly better than the tech-giants keeping it for themselves), thus data commons are good—all because “sharing is caring.”

This logic extends from cultural and aesthetic products like music, movies, pictures, and consumer goods to the sharing of data in data commons. While the logic may be correct with regard to many types of goods and data—knowledge, literature, and research (or music, movies, and consumer goods)—it is a fallacy to include personal data in it. As argued above, the shared objects are different in nature: with knowledge, research, literature, music, movies, and consumer goods, the shared object is the actual content. With personal data, the shared object is behavior, preferences, and views into the future (prediction). Thus, the logic sketched above falls apart when research data and data owned by tech-giants enter the picture, because here we are often dealing with personal data in some way or other (depending on where we draw the line between personal data and non-personal data, if that line can be drawn at all, cf. Purtova, 2017, 2018; Søe et al., 2021).

The challenges of the data commons arise from the logic that if sharing of some data is good, then sharing of more data must be even better; sharing is caring, regardless. But given that all data and information in data transfers to third parties and in data commons is personal data or information (cf. Purtova, 2018; Søe et al., 2021), because all data and information is derived from, produced by, or about someone, sharing is not necessarily good. Furthermore, any data or information can be sensitive and revealing when datafied and analyzed for specific purposes within a given context (Mai, 2016, 2019; Søe, 2021). In this sense, whether the data or information is stored by private companies or in data commons is not relevant. Furthermore, “the conflation of commons and ‘open access regimes’—namely resources that are not owned by anyone—obscure the fact that the very possibility to govern a resource in a fair and equitable way requires that someone owns it” (Prainsack, 2019, p. 3). This means that if we want to establish data commons, we would need to establish data ownership as well, transferring data rights to those governing the commons. This is part of what we see with data trusts, where a third party governs the use of the data on behalf of those who have contributed it. However, this does not solve the problem of aggregation, profiling, and inference of new data—behavioral surplus—or the impossibility of anticipating the possible implications for data subjects when this data is put to use.

5 Ethics

With the understanding of data outlined above, we wish to question the dominant conception of sharing as caring: as something that is good by default, and something that in and of itself carries positive connotations, as exemplified by John (2012):

“Sharing, then, is a concept that incorporates a wide range of distributive and communicative practices, while also carrying a set of positive connotations to do with our relations with others and a more just allocation of resources.” (p. 176)

We wish to argue that sharing takes place in practices, and that the actors in those practices, especially professionals, take responsibility for the design, regulation, and conceptualization of data sharing, storage, and analysis. The responsibility for any single event may not lie with individual people, but is shared among those involved:

“Not so long ago the huge personal data capture and exploitation affair surrounding Facebook and Cambridge Analytica was in the news ... Responsibility should be shared among those involved, including applied mathematicians, information and communication technology professionals, computing specialists and authors of algorithms and Apps, as well as those who fund, instigate and control these applications across society at great social costs, for political and economic gain.” (Ernest, 2021, p. 3161)

This responsibility comes at three levels: (i) being a good, virtuous professional; (ii) taking social responsibility for the practice; and (iii) contributing to the development of the social and moral tradition of the practice.

While much conversation about data ethics and computer ethics focuses on the potential harms of certain data practices and the consequences of computational processes and procedures—both within specific organizations and in society as a whole—the conversation is often directed towards developing specific guidelines for what to do and what not to do, to help policymakers and technology designers do the right thing (cf. van Maanen, 2022). However,

“‘Do the right thing’ is a moral principle we all believe in, which admits no exceptions. We should always do what is right. However, this rule is so formal that it is trivial—we believe it because it doesn’t really say anything.” (Rachels & Rachels, 2012, p. 133)

Thus, instead of general guidelines for data sharing, we propose to conceptualize data commons as practices (or, in Wittgenstein’s terms, forms of life) that establish the ethical and moral principles governing the practice.

Sharing and data commons are social enactments of collective, shared, goal-directed activities that rely on active members of the community. Members of the practice depend on each other and are only individually successful if the practice as a whole is successful. Practices are therefore not merely a group of people’s individual habits and activities, but are better conceptualized as the “collective accomplishment” (Barnes, 2005, p. 31) of the group. To consider what is good for a group, we need to look beyond the individual level of activities and instead focus on the coordination and social power of the group. Or, as Wittgenstein phrased it, practice is about being part of the same form of life and language game, because “speaking a language is part of an activity, or of a form of life” (Wittgenstein, 1953, §23). Wittgenstein gives a range of examples of what he thinks of as forms of life, including: giving orders and obeying them, describing the appearance of an object, constructing an object, reporting an event, speculating about an event, forming and testing a hypothesis, making up a story, play-acting, making a joke, and solving a problem. These are examples that require cooperation and the shared establishment of understanding, meaning, rules, and value.

We suggest that (data) commons and the activities of sharing are best understood as forms of life, social activities, or practices where members of those communities are regarded as “profoundly interdependent, mutually susceptible social agents” (Barnes, 2005, p. 34). This is related to MacIntyre’s notion of practice:

“By practice I am going to mean any coherent and complex form of socially established cooperative human activity through which goods internal to that form of activity are realized.” (MacIntyre, 2007, p. 187)

In this understanding, a practice is when humans collaborate to achieve something beyond their own personal goals and ambitions—it is something that can only be achieved when people interact within a system to reach a common, shared goal.

However, the fact that people collaborate on a task does not necessarily make that activity a practice—the task needs to be embedded in a greater collective whole. “Tic-tac-toe,” “throwing a football with skill,” “bricklaying,” and “planting tulips” (MacIntyre, 2007, p. 187) are not examples of practices, whereas a “game of football,” “chess,” “architecture,” “farming,” “enquiries of physics, chemistry and biology,” “work of historian,” “painting,” and “music” are (MacIntyre, 2007, p. 187). Common to these examples of practices is that there is a “certain kind of relationship between those who participate” in the practice (MacIntyre, 2007, p. 191). Thus, the creation and maintenance of data commons is also a practice—it is not just a collection of data under specific regulation.

A common feature of practices is that participants “have to accept as necessary components of any practice with internal goods and standards of excellence the virtues of justice, courage, and honesty” (MacIntyre, 2007, p. 191). As such, ethics (determining what is good, what is the right thing to do) is embedded in the particular social and cultural practice and decided collectively within the practice. How each individual in the practice ought to act is determined by the internal goods, the ethical standards, of that practice. A practice comes with a set of “standards of excellence and obedience to rules as well as the achievement of goods” (MacIntyre, 2007, p. 190), and therefore, when one enters and becomes part of a practice, one has to “accept the authority of those standards and the inadequacy of my own performance as judged by them” (MacIntyre, 2007, p. 190).

Similarly, just as one has to accept the rules, language, and standards of the practice when one enters it, one’s actions can also only be judged ethically within that practice, because “without these social contexts there is no ethical system” and the “overarching moral and social tradition” of the practice will “develop and grow in a reflexive and self-sustaining way” (Ernest, 2021, p. 3140). This means that one’s actions will be judged ethically within the practice and against the standards of that practice, not against a set of independent and external principles or system of ideas.

As such, the ethics of sharing in data commons will need to be considered within the particular practice and weighed against the internal goods of that practice.

6 Privacy

The argument concerning the ethics of sharing could end with the establishment of data commons as their own practices and forms of life—cf. the MacIntyrean and Wittgensteinian frameworks presented above—with their own internal goods, including a more nuanced notion of personal data and the implications of sharing such data. However, as always when the subject matter is personal data, questions of privacy inevitably present themselves, concerning both morality and law. As such, data commons, and thereby data sharing, have implications for our possibilities to negotiate and enjoy privacy.

Fundamental to people’s privacy is, as Busch argues, acknowledging “our unknown and unseen selves, and offering these up only when and if we choose, [as it] is essential to our ability to engage in close relationships” (Busch, 2019, p. 29). As such, we ask whether personal data included in data commons (e.g., health data, behavioral data) reflects our otherwise unknown and unseen selves and thereby breaches our privacy. Thus, sometimes non-sharing is caring—especially when we are dealing with personal data.

Further, to complicate matters, personal data “is interrelated, and people’s decisions about their own data have implications for the privacy of other people” (Solove, 2022-draft, p. 2). In other words, data in data commons has implications for many people in many different ways: it has implications for those people the data is directly linked to (i.e., those whom it is absolutely about) and for those people it indirectly links to (i.e., those whom it is relatively about) (cf. Søe et al., 2021). These implications might take the form of privacy harms, inferences, and the production of new information and knowledge, with consequences for the individual, the collective, and society. Moreover, sharing of data has implications for those people who in some way are “acted upon” based on the inferences, and the information and knowledge produced. Whether these implications are good or bad depends on what kind of practice the data commons establishes. By whom, and for what purpose, is the data, the information, and the knowledge ultimately used?

That personal data is almost always interrelated (Solove, 2022-draft), or relatively about others than whom it is absolutely about (Søe et al., 2021), could point to data commons as a better and more democratic use of the data that is already being collected about us. If your data is also about me, and my data is about someone else, then why not get the best out of it and pool all the data in a common for collective, innovative, constructive, and good use? Following MacIntyre (2007), what is good is determined by the specific practice in question. Thus, one way to define an ethics of sharing would be to govern the specific practices in terms of which practices are legitimate and which are illegitimate in connection to data commons.

However, the question of privacy seems to cut across all practices questioning the collection of personal data in the first place. According to Solove (2022-draft) “effective privacy protection” is about “bringing the collection, processing, and transfer of personal data under control” (p. 2). Zuboff (2019) takes this argument a step further, suggesting that “laws that pertain to ‘data ownership’ or ‘data protection’ overlook what is original in this latest ‘original sin,’ namely, the assertion that human experience is free for unilateral (and secret) rendition into behavioral data in the first place” (Zuboff, 2019, pp. 31–32). While Zuboff (2019) is pointing to big tech, her argument also pertains to repositories or data commons.

In the end, it is a question of what kind of society we want. That data is gathered in a data common does not necessarily make the use of that data less harmful; that depends on what the data is used for and on the internal goods of that particular practice. But one thing is certain: privacy is challenged regardless of whether the data is in a common or in a private repository—it is only a matter of by whom and how the data is accessible.

Therefore, an ethics of sharing ought to consider the right to be unseen and invisible when one wants to, to be left alone, to disappear in the pile of data and not be seen, to retract from the processes of data analytics and endless inferences, and to not be acted upon based on data traces. As Busch notes in “How to Disappear: Notes on Invisibility in a Time of Transparency” (2019),

“In the woods no more than an hour, I am struck anew by invisibility, and its improvisational choreography, as a necessary condition of life. I am reminded of the grace of reticence, the power of discretion, and the possibility of being utterly private and autonomous yet deeply aware of and receptive to the world.” (Busch, 2019, p. 4)

Thus, being invisible and private is not the same as not caring and not participating in the world—at times it is precisely what enables participation and the development of autonomy. It is another way to take in the world. This stands in stark contrast to the “sharing is caring” logic.

When sharing is equated with an ethics of care and therefore experienced and framed as something good by default, hiding, holding back, and not sharing become something negative. “Hiding and being hidden are routinely viewed in relation to disrespect, bias, prejudice, shame, and failure” (Busch, 2019, p. 10). However, fundamental to privacy is the notion that one has a right and a need to remain hidden, to not share, to not be found, to remain anonymous. This is central to the classic understanding of privacy—starting with Warren and Brandeis’s (1890/2005) definition of privacy as the right to be let alone—in which privacy is the right to prevent others from gaining access to information about you, from getting too close to you, from accessing your home, and from obtaining knowledge about you. The classic understanding of privacy is also the foundation for informational privacy: from Westin’s (1967) definition, “Privacy is the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others” (p. 7), to data protection frameworks such as the General Data Protection Regulation (2022).

Foundational to an ethics of sharing needs to be a concern and respect for people’s right to privacy. The collection, processing, storage, and sharing of data—even in data commons—might infringe on data subjects’ privacy. The more challenging part may be to consider, for the particular practice, which understanding of privacy and which understanding of personal information is at play. One could regard privacy and personal information in a narrow sense, as being only about people’s most sensitive data, or one could take a broader perspective and regard matters of privacy and personal information as being about people’s right to form their own identity (Søe & Mai, 2022).

Another aspect of an ethics of sharing would be to consider situations where the good thing to do would be to actively choose not to share information. Anita Allen (2011) suggests that there are situations in which privacy should be proactively and paternalistically enforced. She is concerned that people may too readily offer personal information about themselves—or, we may add, about others, either directly or indirectly—which may harm them at a later point. In these situations, the good thing to do might be to prevent the sharing of information. This argument could be viewed as controversial in its paternalism, as autonomy and paternalism do not sit well together (cf. Moore, 2013; Roessler, 2013). However, there might be something right and crucial in the question of whether there are types of information that should never be shared: certain types of information we are ethically bound not to share because sharing will have negative implications for ourselves, others, and society. Again, sometimes non-sharing is caring.

It might be true that the sharing of personal information, emotions, etc. is a constitutive force of intimate relationships, but so is privacy, and maybe we ought not to have intimate relationships with everybody and anybody. As Busch (2019) states, “this may be the moment to reclaim invisibility’s authority, rescue its greater meaning, and reconsider it as a positive condition of human experience” (p. 12).

7 Conclusion

As argued in this paper, the logic of “sharing is caring” is flawed in itself. Sometimes sharing is not caring. In The Circle, sharing becomes devastating, depriving people of their humanity. The concept of privacy likewise concerns individuals’ ability to influence the construction of their own identity (Søe & Mai, 2022) and is as such a right that protects human subjectivity (Hildebrandt, 2019). However, while privacy is often conceptualized as an individual right to control one’s personal information (Westin, 1967)—and problematically so (cf. e.g., Solove, 2008)—privacy is, at its core, an intrinsic and invaluable part of what enables us to develop as autonomous beings in democratic societies (Lever, 2016; Solove, 2008; Cohen, 2013).

Data commons are proposed to counter big tech’s hoarding of data about people. However, data commons might inherit some of the same privacy problems and issues that big tech’s data repositories face. The idea and practices of commons are noble and democratic, but they are altered in fundamental ways when it is data that is in the commons instead of oceans, knowledge, music, and the like. The data is often derived from people in different ways and is thus related to people, making it personal data. Such data is significantly different in nature from fish, mountains, or software. Thus, the logic of “sharing is caring” currently underlying the idea of data commons is flawed, calling into question the idea of data commons as well as the collection of personal data and information.

We propose a new ethics of sharing that questions the default link between “share” and “care.” This ethics is based on MacIntyrean practices and Wittgensteinian forms of life, with an emphasis on the privacy issues arising from the interrelated nature of personal data and information. Data is not like fish—it is about people—and the sharing in data commons is of a different type than when we share our data ourselves. In data commons, as their own form of life and practice, sharing is sharing between third parties, with unforeseen and unknowable consequences for individuals.