Detaching data from the state: Biobanking and building Big Data in Sweden
LifeGene, a biobank and research infrastructure, is Sweden’s largest biomedical project. Designed for research on gene–environment interactions, the project aimed to collect data and biological samples from 500 000 individuals. The directors pointed to Sweden’s universal health-care system, national registries and pro-science citizenry as indicative of the nation’s unique suitability for this ambitious project. As researchers explained, in Sweden, large-scale national collection of personal data has generally proceeded with little debate. In this view, the historical legacy of social engineering and close ties between science and the state has led to a popular sense of trust in the state to collect and use information in the best interest of the population. However, LifeGene is more than just a continuation of information-gathering-as-usual in a country where the government has long kept track of its population’s health and social characteristics. With LifeGene, the construction of surrounding research infrastructures, and a reworking of national data protection legislation, Swedish researchers and authorities are now purposely building a framework for moving from data-as-usual to Big Data and the Big Value it promises to deliver. Drawing on ethnographic fieldwork with Swedish researchers and data managers, this article will examine the legal, social and infrastructural challenges of Sweden’s bid for Big Data.
Keywords: Big Data; Sweden; biobanks; medical research; infrastructure
LifeGene, Sweden’s most ambitious national health and biobank project, began as a pilot study in 2009, and was launched in 2010. Designed to facilitate large-scale prospective epidemiological research by collecting high-quality lifestyle and genomic data, LifeGene was coordinated by the Karolinska Institute. The goal of the project, funded through a public–private partnership, was to enroll 500 000 individuals across Sweden, who would be given a comprehensive web-based questionnaire, and from whom biosamples and physical measurements would be taken.1 The questionnaire, measurements and samples would be repeated at 5-year intervals for at least the next 20 years. Participants in the project were asked to give consent to LifeGene for the processing of personal data collected specifically for the study, the storage of their blood and urine samples in a biobank, and access to personal data from medical records and health-relevant data in Sweden’s national registries (Maeurer, 2010; Almqvist et al, 2011; Regeringens Proposition, 2013).
LifeGene was part of a larger Swedish initiative creating comprehensive research infrastructures to enable the collection and storage of data on an unprecedented scale, and developing technological platforms for analyzing these data sets to discover patterns and associations. Thus, I argue, LifeGene can be seen as part of a national push toward Big Data in biomedical research and across multiple domains of health and social welfare (Regeringens Proposition, 2008; SRC, 2012; SRC, 2014). For purposes of this article, I define Big Data as the analysis of data contained in very large – and often mixed – databases. Big Data projects, including LifeGene, require the development of elaborate, novel infrastructures – in particular when researchers want to access or combine data across research domains, jurisdictions, and, potentially, national borders.
In 2009–2010, when the LifeGene project was introduced, and when I conducted fieldwork, the term Big Data was not widely used in Sweden. References to Big Data (the English term has been adopted in Swedish usage) can be found in research reports beginning around 2012 (for example, Görnerup et al, 2012) and in the popular press around 2013, following the revelations of NSA surveillance in the United States and of the cooperation of Sweden’s FRA (National Defence Radio Establishment) (Dagens Nyheter, 2013; Kielos, 2013; Rebas, 2013). As Big Data entered the popular lexicon in Sweden in 2013 and 2014, the term began to be used to describe LifeGene and other large-scale research databases (for example, Lagerwall, 2013; Jonsson, 2014). Here, I will trace the development of LifeGene, the construction of surrounding research infrastructures, and the reworking of national data protection legislation over the period from 2009 to 2014 as an example of how Swedish researchers and authorities are actively working to move from data-as-usual, as it has been understood in the Swedish context, to Big Data and the Big Value – the seemingly limitless knowledge and economic growth – it promises to deliver.
In Sweden, as elsewhere, the rhetoric of Big Data draws on a sense of its technological novelty and its perceived ability to produce novel forms of social and scientific knowledge and economic value. Big Data is also often characterized as perpetually in flux: accumulating, growing and moving beyond existing capacities for containing and processing data (Mayer-Schönberger and Cukier, 2013). These ways of imagining Big Data as both mobile and anticipatory came into play in Swedish discussions of the LifeGene project, and, I will argue, helped to produce a crisis of legitimacy that led to the temporary shutdown of LifeGene in 2011 by the Swedish Data Inspection Authority (DIA). In 2013, a new law was passed to allow LifeGene’s work to continue, resolving the 2-year period of legal uncertainty. The DIA’s 2011 decision and the subsequent debates revealed how the detachment of health data collection and storage from existing social welfare state infrastructures and oversight mechanisms emerged as an obstacle for LifeGene, raising the question of whether the possibility of participating in Big Data can be reconciled with national data as it has been framed in the context of the Swedish social welfare state.
Paradoxically, LifeGene aimed to build on qualities imagined as uniquely Swedish while standardizing data protocols to facilitate seamless linkage with research data worldwide and creating a large-scale database that stands apart from Sweden’s traditional state-managed and nationally bounded model of data collection. Thus, I will argue, LifeGene is emblematic of a larger tension in Swedish biomedical research in which the particular history of the Swedish welfare state and the related longstanding and widespread usage of personnummer (personal identification numbers) has produced an unusually rich trove of national data, but at the same time, the value of Swedish data increasingly depends on its use for producing knowledge about human health and behavior more generally, indeed, through what is identified as global data. This longstanding tension has been amplified as Sweden, like many other nations worldwide, works not only to build the technological infrastructure for Big Data, but also to adapt governance structures to the social and legal challenges of collecting and using data in new ways. Sweden, which has long been known for its detailed national demographic and medical databases and its progressive social welfare state, offers an especially interesting context for exploring the rescaling of ethical concerns in light of profound changes in data technologies and practices.
In this article, I draw on ethnographic fieldwork conducted in Stockholm, Sweden with researchers and database managers over 15 months between 2009 and 2010, including 30 tape-recorded, semi-structured interviews. This ethnographic research focused on researchers from a range of disciplinary backgrounds, all of whom studied genetic and environmental influences on behavior and used data from the Swedish Twin Registry, housed at the Department of Medical Epidemiology and Biostatistics at the Karolinska Institute. The LifeGene project includes an oversampling of data from twins in the Swedish Twin Registry, and its development was described to me by researchers affiliated with both projects as one of the priorities of the Swedish Twin Registry. In addition to researchers affiliated with Karolinska, I also interviewed researchers at Stockholm University, the Swedish Institute of Financial Research, the Stockholm School of Economics, and Uppsala University, and database managers at Statistics Sweden. I conducted participant-observation at lectures, seminars, dissertation defenses, conferences, meetings and in social gatherings with informants. Tape-recorded interviews were conducted in English, while participant-observation took place in both Swedish and English.
I also draw on archival research and textual analysis of Swedish newspaper and popular scientific media reports about LifeGene and Big Data; academic publications by researchers affiliated with LifeGene; Swedish legislation concerning personal data, biobanks, and science and research policy; Swedish government reports; policy documents and reports published by public authorities including the Swedish Research Council (SRC), the DIA, and the Swedish Initiative for Research on Microdata in the Social and Medical Sciences (SIMSAM); newsletters from the Biobanking and Molecular Resource Infrastructure of Sweden (BBMRI.se); and the website, brochures and other public documents released by LifeGene and the DIA. I began collecting these documents during fieldwork, and continued to follow subsequent developments by gathering documents online. I engage with these documents as “paradigmatic artifacts of modern knowledge practices” (Riles, 2006, p. 2), which both produce and reveal the controversies and formations of consensus that they delineate.
LifeGene as a Unique Swedish Resource
LifeGene, however ambitious, is by no means the first large-scale population biobank worldwide. Swedish researchers have observed the development of similar projects such as those in the United Kingdom (Hoeyer and Tutton, 2005; Petersen, 2005; Allen et al, 2012; Collins, 2012), Iceland (Pálsson and Rabinow, 2001, 1999; Fortun, 2008) and Japan (Triendl, 2003; Kuo, 2011), among others. In recent years, biobanking has emerged as a specialty of the Nordic countries (Finland, Sweden, Norway, Denmark and Iceland), whose aggregated biological samples are estimated to encompass “one fourth of the world’s collective biobank capital” (Regeringens Proposition, 2008, p. 39, my translation).
LifeGene was not the first Swedish biobank – by 2012, Sweden had more than 600 registered biobanks. Of these, 325 biobanks, with 150 million collective samples, were managed by regional councils for health-care purposes. The other biobanks were managed by universities, companies and administrative authorities (Eaker, 2012, p. 8). Northern Sweden is home to the Medical Biobank, which only a decade ago was considered one of the world’s largest research biobanks, with genetic and medical information from over 70 000 inhabitants of Västerbotten county. The Medical Biobank is perhaps best known for its relationship with the private company UmanGenomics, which was given a monopoly on commercialization of samples (Austin et al, 2003; Hoeyer, 2006, 2005, 2003). Yet, LifeGene was seen as offering a unique and vital resource in Sweden and internationally.
Sweden has several crucial prerequisites for comprehensive longitudinal biomedical research, such as the personal identity number, the universally available national health care system, continuously updated population and health registries and a scientifically motivated population. LifeGene builds on these strengths to bridge the gap between basic research and clinical applications with particular attention to populations, through a unique design in a research-friendly setting.
(Almqvist et al, 2011, p. 67, emphasis added)
LifeGene is portrayed as benefitting from Swedish governmental and institutional structures, but also from an intangible, though valuable quality of the Swedish population – a collective orientation toward scientific research. Scientific motivation is described as an intrinsic quality of the population, a kind of national ethos linking participation in and support of scientific research with what it means to be Swedish. A scientifically motivated population might suggest higher rates of participation in a voluntary research project and political support for LifeGene’s proposed large-scale collection and storage of personal data, which might be considered controversial in another national context. However, like the universal health-care system, the classification of personal identification numbers and the population registers, motivation to participate in and support scientific research has emerged from a close historical connection between national data collection and social welfare policy in Sweden.
Records, Numbers and Populations
In Sweden, as throughout Northern Europe, there is a long tradition of detailed recordkeeping. In the seventeenth century, church officials carefully maintained records of parishioners; by the mid-eighteenth century, a central tabulation office was established to collect information about occupation, migration, births, deaths and marriages (Kälvemark, 1977; Sköld, 2004; Axelsson, 2010). As a result, many of the contemporary registers managed by Statistics Sweden, the national statistical office, have records that date back to the eighteenth century (Axelsson and Schroeder, 2009).
In addition to their historical depth, national databases store an astonishing range of information on health and behavior. Since 1947, all Swedish citizens and immigrants have been assigned personnummer, or personal identification numbers, and must use these numbers in interactions with public institutions and in many commercial transactions, whether checking out library books, visiting a doctor, opening a bank account, picking up prescriptions or paying bills. As an American citizen temporarily residing in Sweden during fieldwork, I was often asked for my personnummer, and on several occasions, store clerks or librarians were astonished by the idea of a person living without a personnummer. As a result of the ubiquitous use of the personnummer, data collected by the government and stored in databases can easily be cross-linked with information in other Swedish databases (Ludvigsson et al, 2009). Researchers, after receiving ethical approval, can apply to Statistics Sweden or other agencies such as the National Board of Health and Welfare to gain access to government register data as needed for a particular project, and the data are then made available to the researcher in a de-identified form.
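The linkage-and-release logic described above can be illustrated with a minimal sketch. It is not the procedure used by Statistics Sweden or the National Board of Health and Welfare; the register contents, the identifier format and the function names `link_registers` and `deidentify` are hypothetical, and the keyed hash used here is only one common pseudonymization technique, assumed for illustration. It offers weaker protection than full anonymization.

```python
import hashlib
import hmac
import secrets

# Hypothetical register extracts keyed by a made-up identifier
# (these are NOT real personnummer, and the fields are invented).
health_register = {
    "ID-0001": {"diagnosis": "asthma"},
    "ID-0002": {"diagnosis": "diabetes"},
}
population_register = {
    "ID-0001": {"birth_year": 1975, "county": "Stockholm"},
    "ID-0002": {"birth_year": 1982, "county": "Uppsala"},
    "ID-0003": {"birth_year": 1990, "county": "Västerbotten"},
}


def link_registers(*registers):
    """Merge the records of every person who appears in all registers."""
    shared_ids = set(registers[0]).intersection(*registers[1:])
    linked = {}
    for pid in shared_ids:
        merged = {}
        for register in registers:
            merged.update(register[pid])
        linked[pid] = merged
    return linked


def deidentify(linked, secret_key):
    """Swap each identifier for a keyed hash before release (assumed technique)."""
    released = []
    for pid, record in sorted(linked.items()):
        digest = hmac.new(secret_key, pid.encode("utf-8"), hashlib.sha256)
        released.append({"pseudonym": digest.hexdigest()[:12], **record})
    return released


# The key stays with the data holder; the researcher sees only the extract.
secret_key = secrets.token_bytes(32)
linked = link_registers(health_register, population_register)
research_extract = deidentify(linked, secret_key)
```

In this sketch the shared identifier does all the work of joining the two sources, which is why a single number used across institutions makes cross-linking so easy, and why removing it from the released extract matters.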
The other Nordic countries also have similar personal identification number systems and population-based databases, which facilitate biomedical and behavioral research in much the same way (Watson, 2010; Hovde Lyngstad and Skardhamar, 2011; Pukkala, 2011; Bauer, 2014; Langhoff-Roos et al, 2014). However, Sweden’s population of around 9.5 million is by far the largest of the Nordic countries; Denmark, the second largest, has a population of about 5.6 million, while Iceland, the smallest, has fewer than 400 000 inhabitants. For research that requires large sample sizes, the combination of Sweden’s relatively large population with its extensive data collection practices makes it stand out, even in the Nordic context.
While all the Nordic countries have biobanks, each national population is described in a slightly different way. For example, in the 1990s and early 2000s, deCODE and Iceland’s Health Sector Database promoted the advantages of research on what they described as the small, genetically homogenous Icelandic population, where geographic isolation promised a “little changed gene pool” (Fortun, 2008, pp. 2–4). Thus Icelandic biovalue is imagined to lie in its small, genetically unique population (Waldby, 2002; cf. Rajan, 2006; Waldby and Mitchell, 2006; Mitchell and Waldby, 2010), while the Swedish population is imagined as uniquely generalizable – like everyone else, but with better records, more data and greater scientific motivation. As ethnologists Billy Ehn, Jonas Frykman, and Orvar Löfgren have observed, one of the most prominent Swedish cultural narratives is, ironically, the self-representation of Sweden as not having any particular culture (1993). This logic, not coincidentally, also characterizes a strong universalist ideology of scientific rationality. The sciences, too, are often imagined as “cultures of no culture” (Traweek, 1992; Franklin, 1995).
Trusting the State
When I asked Evald, an epidemiologist at the Karolinska Institute, if there was anything specific about the Swedish population that he needed to account for in his research, he explained that there was nothing special about “the Swedish population per se, but our history in terms of introducing registries, and having a strong state, and individuals trusting the state and letting us register them, and so on. That has been an advantage over the long term, with the political system we have had where the state takes care of people.”2
This political system, characterized by an intimate “science-state nexus” (Asdal and Gradmann, 2014), has been associated with the Scandinavian region more broadly, where “particular emphasis has been placed on […] the use, utility, and applicability of science in society” (Asdal and Gradmann, 2014, p. 178). In Sweden, these close linkages between science, society and the state have contributed to the moral valences of scientific practice and a widespread trust in the government to steer policy rationally and in the best interest of the population (Rothstein, 2006). This particular national history thus has far deeper roots than the present interest – we might even say obsession – with collecting large quantities of data and correlational analysis of vast amounts of information that might lead to innovative scientific findings.
Other researchers, such as Alf, a psychology professor working for the Swedish Prison Service, characterized trust in science as part of the history of the social welfare state: “In Sweden, I think our self-perception is that we have a strong tradition from the 1940s and 50s with social engineering, and ideas about how scientific knowledge should sort of engineer society.” Sweden’s national interest in deriving social policy from scientific analysis was an explicit concern of the political leaders, or the social engineers, of the mid-twentieth century welfare state. Inspired by utopian thought and scientific socialism, they set out to transform the nation’s future through the rationalization of social institutions, services and policies (Eyerman, 1985).
In order to produce the right kind of citizens to inhabit modern Swedish society, the social engineers put scientific rationality to work in the service of social morality, confident in expert analysis to diagnose society’s evils and for technically informed social policy to “set things right” (Myrdal and Myrdal, 1934; Hirdman, 1989; Sejersted, 2011). The technocratic character of the era’s social reforms was deeply connected with the utopian dream of the good society. Yet, as Alf also pointed out, Sweden’s legacy of empowering experts has a darker side, particularly in biomedicine: “I think we have a strong tradition of paternalism, if you look at medicine, for instance, or if you look at coercion in psychiatry. I think we have a strong tradition of putting the – you know, what is collectively most important. Swedes are pretty much, I think, team players, so we decide what is best for the group, and then the individual will have to adjust to that.”
Evald’s description of trusting the state and Alf’s suggestion that “we decide what is best for the group” draw on an understanding that Sweden’s far-reaching and longstanding data collection practices are legitimized by their use to steer policy toward the common interest of the collective. Thus, motivation to participate in research and allow data collection is connected to trust in the state in that scientific research is understood to help the government and society, while data-driven scientific analysis ensures that the population as a whole, rather than special interest groups, benefits from social policy.
Sweden’s reliance on population data also resonates with a more widespread politics of numbers in liberal democracies, where political decisions have increasingly been based on numbers, which seem to “promise a ‘de-politicization’ of politics” (Rose, 1991, p. 674). As a political technology, quantification offers “a way of making decisions without seeming to decide” (Porter, 1996, p. 8), thus disambiguating moral problems by providing seemingly objective solutions (Shapin et al, 1985; Daston and Galison, 2007). In Sweden, this trust in numbers is reflected in the history of social engineering and relative lack of ambivalence about the state’s efforts to rationalize society (Eyerman, 1985). The self-imagination of Sweden as a stronghold of progressive politics – a ‘moral superpower’ – has proven an enduring, powerful concept (Nilsson, 1991; Pred, 2001; Dahl, 2006).
If citizens trust the state to collect data for the social good, then, as Maj, a microdata manager at Statistics Sweden, explained, perhaps this is because the state is widely seen as trustworthy: “one thing is that we did not take part in the Second World War. We have not had any scandals with registries that have been misused. I think that's a very big … because in other countries, they are very afraid of being registered. There is no such fear, or very little of that kind of fear in Sweden.”3
Sweden’s borders, then, have demarcated the limits of this sense of trust. Until very recently, Swedish data protection laws ensured that national data could not leave the country. As Ivar, a data advisor at Statistics Sweden explained to me, researchers “can’t even bring it [data] and work with it abroad. It should stay in Sweden only … If they bring the data abroad, the people in the database are not protected by our laws anymore, and you know, in the United States it could be sold for research there, and then it wouldn’t be protected by laws or anything – it could be sold or used for whatever … But there are many researchers abroad that want [Swedish] data.” In Sweden, the data produce social benefits through circulation and use for research and governance. Yet, in this framework, the legality – and, I would argue, morality – of Swedish data circulation has traditionally been guaranteed only as long as the data are managed by the state and contained within Sweden, which are now increasingly Sisyphean tasks. This protective regulatory regime was designed before the widespread proliferation of the internet, and long before the advent of global networks, cloud computing and easy masking of file origins.
Ivar continued, “the general public doesn’t even know how much data we collect and have here [at Statistics Sweden]. Of course, we protect it very well … . But I think people are quite used to having all of this data collected on them, wherever they go, more or less. […] And then again, they don’t really see how much use all the data we have brings to the society in general … in medical research or any kind of research, and mostly it’s for social planning, for government … . Everything from regulating interest rates to basically the structure of infrastructure is comprised from our data.”
Ivar’s remarks suggest that Sweden’s national data can be seen as literally constitutive of the Swedish state, as data-driven research contributes to rational governance and an efficient welfare system. These ideas are caught up in what LifeGene describes as the scientific motivation of the Swedish population, one of the attributes that signal the great promise of LifeGene and other Swedish Big Data projects. However, scientific motivation or trust in the state cannot be taken as a given, and should instead be seen as tenuous achievements of a particular political system and national history. This became particularly apparent in 2011, when LifeGene was unexpectedly shut down after being found to be in violation of terms of the Swedish Personal Data Act.
LifeGene and the Morass of Swedish Data Protection Laws
From the beginning, the directors of LifeGene were attentive to the importance (and complexity) of governance issues, drafting a detailed ethics policy to ensure compliance with Swedish and international legislation concerning research on human subjects and personal data, and establishing a LifeGene Ethics Council (LifeGene, 2009). Because LifeGene would include participants from birth to age 45, collect data and biological samples to be stored in a biobank, and link to existing national databases, LifeGene would need to comply with laws focused on data protection, privacy and research on human subjects. While the legal issues were complicated, the Regional Ethical Review Board of Stockholm approved the pilot and initial rollout of LifeGene.
The regional board’s approval included a caveat that samples could not be collected from children under the age of 6. Because of this caveat, the Karolinska Institute, the host university of LifeGene, subsequently appealed the regional decision to Sweden’s Central Ethical Board (SOU, 2014, p. 345). The inclusion of younger participants was important because LifeGene – in contrast to most large-scale biobank projects, which focus on chronic diseases – aimed to facilitate research on early-life disorders.4 For this reason, LifeGene used a household-based design, recruiting index persons aged 18–45 and inviting them to include partners and children in the project. This structure “will give the opportunity to involve young couples prior to and during pregnancy, allowing for the first study of children born into cohort with complete pre- and perinatal data from both the mother and father” (Almqvist et al, 2011, p. 68). The “born into cohort” would make it possible for LifeGene to collect data, samples and medical records for those individuals throughout their entire lives.
Interestingly, neither LifeGene’s proposal of lifelong sampling and surveillance nor the inclusion of children as research subjects emerged as the primary ethical concern about the project. Instead, when Sweden’s Central Ethical Board reviewed LifeGene’s application, they concluded that LifeGene was primarily a database, rather than a research project. For this reason, the Central Ethical Board decided that LifeGene did not fall under the purview of Sweden’s Ethical Research Act, which concerns research projects, and thus the board recused itself from reviewing the application or making a decision about the inclusion of children in the project.
LifeGene’s collection and processing of personal data also brought it under the purview of the Data Inspection Authority (DIA), a public authority established in 1973 to ensure compliance with Swedish data laws (DIA, n.d.). The DIA found that LifeGene’s collection of personal data for unspecified future research violated Sweden’s Personal Data Act of 1998, which states that personal data can only be collected for particular, specified and authorized reasons, and cannot be stored for general purposes (Datainspektionen, 2011). The DIA’s decision thus ruled out the possibility of banking on anticipation (Adams et al, 2009) or potentiality (Taussig et al, 2013).
The premise has therefore been that official registers with a large number of people registered and with especially sensitive content will be regulated by law […] The government decides, through regulation, which health data registers can be formed. A characteristic of the registers is that they have nation-wide coverage. The DIA’s opinion is that national databases for research purposes should not be constructed apart from the health data registers. In any case, this should not occur without the parliament or government having taken a position on the issue.
(Datainspektionen, 2011, pp. 4–5, my translation)
Here, the DIA pointed to the uncertain legal status of a large-scale national database that stood apart from the databases managed by the government. LifeGene was in some ways comparable to Sweden’s health data registers and other official registers, and would include data from these registers (for example, when a research subject gave consent for LifeGene to access his or her medical records). LifeGene occupied an ambiguous position; it was publicly and privately funded, and fell somewhere between research (requiring ethical approval) and infrastructure (properly the domain of government). It was unclear how LifeGene should fit within legislation regulating official registers and research based on data compiled within those official registers, and, given the complex tangle of general and special data protection provisions in Swedish legislation, the DIA called for the Parliament to conduct a general review of relevant laws (Datainspektionen, 2011). In this sense, the DIA signaled that LifeGene might represent a new kind of legal object in Sweden: a national-scale research infrastructure developed outside of the traditional governmental and administrative systems of recordkeeping and registration. While LifeGene was by no means the first research database or biobank in Sweden to be administered by a university or company, the difference appeared to be the scale of LifeGene’s data collection, which potentially rivaled that of the official national registers. Here, the global reach of the LifeGene project became a cause for concern, as a massive collection of Swedish population data became unmoored from the welfare state that had long legitimated the collection of data on a national scale.
The result of the DIA’s decision was that in 2011 Karolinska had to cease the collection and processing of personal data associated with LifeGene. At the time, the DIA’s ruling seemed to indicate that national legal procedure outweighed any sense of “investment in the future.” In 2013, Iceland’s Data Protection Agency followed suit, ruling that deCODE’s proposed plan to mine genealogical data, health records and genomic data to impute relationships between participants and other Icelanders would require informed consent from individuals linked to participants through these techniques (Kaiser, 2013).
The Uncertain Status of LifeGene
Karolinska, on behalf of LifeGene, appealed the DIA’s decision. This marked the beginning of almost 2 years of lobbying, debates and reports. During this period, LifeGene, which had previously garnered little public attention, became a fulcrum for media discussions about the proper balance of data protection policies and future-oriented biomedical research. The LifeGene debates fit into a longer tradition of public discussions about data, privacy and scientific research in Sweden that offer an important counter-narrative to the idea of trust in the state that the researchers and data managers I interviewed referred to, and that forms one of the underlying assumptions in LifeGene’s descriptions of Sweden as a “research-friendly setting” characterized by a “scientifically-motivated population.”
Before the LifeGene case, one of the most heated debates about research data in Sweden centered on the Metropolitan Project, a sociological study that included around 15 000 research subjects (Qwerin, 1987; Janson, 2006). The Metropolitan Project became the subject of controversy in 1986, when the major newspapers reported on a “secret study,” researchers who “knew everything,” and a “crime against privacy and democracy” (Cool and Hoshor, 2012). The project was seen as problematic because the research subjects had not been clearly informed about how much information was being collected, for how long or for what purpose. These public debates brought new attention to privacy issues, and ultimately led to the DIA’s decision to destroy the key tape containing the identifiable data in the summer of 1986 (Cool and Hoshor, 2012). In 2003, an impassioned media debate broke out after Anna Lindh, the Minister for Foreign Affairs, was assassinated, and a blood sample from the national PKU biobank was used for DNA analysis that identified her killer. In these discussions, the view that forensic use of this medical resource would erode public trust in biobanks eventually won out (Hansson and Björkman, 2006).
In the LifeGene debates, journalists and critics pointed to the lack of transparency about how data would be used and protected, asking: “How is data protected against commercialization and unholy alliances between insurance companies, employers, and recruiting agencies? What happens to privacy when data is cross-pollinated in an ever-denser forest of registers? What constitutes ethically approved research on the information given by participants?” (Johannisson, 2012). In February 2012, Education Minister Jan Björklund announced that new laws would be put in place to allow LifeGene to continue, noting that “Sweden will be world-leading within register-based research, but we must become better at exploiting the gold mine we have” (Utbildningsdepartamentet, 2012, my translation). In response, journalists questioned the logic of “if research doesn’t follow the law, then the problem is with the law” (Nilsson, 2012, my translation) and pointed out that the difficult privacy considerations surrounding LifeGene, rather than the complex legal issues, could just as easily be seen as the root of the problem (Dagens Nyheter, 2012). In 2013, the legal issues were resolved through the passage of a temporary law (SFS 2013:794), which allowed LifeGene’s work to continue. However, the legal resolution did not conclusively settle the public concerns about privacy and transparency that had been articulated during the period of debate. In contrast to the earlier debates about the Metropolitan Project and the PKU biobank – where research subjects’ privacy and public trust in medical research were considered paramount values, leading to the destruction of data or restriction on usage of existing data collections – in the case of LifeGene, the potential value of future research was prioritized, leading to continued data collection.
A New Law for LifeGene
The bill for the new law, “Registers for research on what nature and nurture mean for human health” (sometimes referred to as the LifeGene law), concerned biomedical research registers managed by universities. The bill describes LifeGene as building a foundation for studying how genes, environment and lifestyle influence health, and identifying the causes of the most common diseases affecting Swedish citizens. The bill further sets out that data collected by LifeGene “also constitutes a resource for answering questions in the future which we cannot fully formulate today” (Regeringens Proposition, 2013, pp. 19–20, my translation). Thus, the bill suggests that the biomedical benefits that LifeGene might offer in the future are of such significance as to justify a major overhaul of personal data and research legislation. The laws would shift to accommodate the promise of LifeGene and Big Data, resonating with Adams et al’s broader observation that the biomedical “present is governed, at almost every scale, as if the future is what matters most” (2009, p. 248).
With the temporary law in effect, LifeGene was able to resume collecting data in late 2013. Participants were asked to give informed consent for data and samples to be collected, stored and used in future health-related research approved by the ethical review board, including research outside of Sweden and the EU – a striking change, given the formerly strict restrictions on personal data circulation beyond Sweden. Participants were also asked whether they would give LifeGene permission to access their medical records for health-related research.
Biological samples would be processed and stored in a biobank using standardized protocols to enable complex molecular biological analyses including proteomics, metabolomics, genomics, epigenomics and toxicomics. What the omics have in common is that data from biological samples can be aggregated into large data pools. Conventional forms of data processing and analysis are inadequate for data sets of this size and complexity. Instead, Big Data tools like data mining and machine learning algorithms are used to find patterns and identify possible research questions.
Evald offered this explanation: “the different omics – like you have genomics and proteomics and metabolomics and everything – it’s just another word for hypothesis-free searching of a large space.” While the idea of data mining as hypothesis-free is popular, it is worth noting that, as Hallam Stevens has argued: “This notion of letting the data ‘speak for themselves’ is no doubt a problematic one: all kinds of models, assumptions, and hypotheses necessarily intervene between a measurement and scientific fact. Yet it accurately portrays what many biologists think they are doing in bioinformatics” (2013, pp. 68–69). However, Big Data analytics is inductive; algorithms are used to search for patterns and relationships that would be impossible to see without the aid of automated techniques. Thus, in contrast to older forms of data analysis, it can be more difficult to predict what kinds of research questions and results will emerge from a particular data set.
The uncertainty inherent in Big Data analytics requires a more expansive approach to informed consent. As LifeGene’s ethics policy explains, “Participation will be presented as an opportunity to contribute to a resource that may, in the long term, help enhance other people’s health. Because it will be impossible to anticipate all future research uses, consent will be sought for research in general that is consistent with LifeGene’s stated purpose” (LifeGene, 2010, p. 7). LifeGene’s access to personal data in national registers is also framed broadly: “LifeGene wants to track health events, the development of disease, and the course of treatments. […] The range of different records that can be accessed will be determined by developments in the health service electronic records systems […] LifeGene will not be able to say in advance which data from these various records will be needed” (LifeGene, 2010, p. 9). Participation now requires permission for future, unspecified health-related research as approved by the future ethical review board (Helgesson and Eriksson, 2011; Helgesson, 2012). While opting out is theoretically possible, as participants can withdraw at any time, they will not be recontacted with each use of their personal data covered under future research. For participants who withdraw, “such a withdrawal would prevent information about them from contributing to future longitudinal analyses, but would not be feasible to remove their data from analyses that have already been done” (LifeGene, 2010, p. 10). In this sense, it may be impossible to guarantee that data, once collected and analyzed, can be sufficiently contained in order to be destroyed.
Participants are also asked for permission to access and process multiple forms of personal data, from medical records (including those which do not yet exist) to omics data, which reflects LifeGene’s orientation toward Big Data analytics, and is indicative of a transformation in informed consent now underway as the meanings of both data and research are increasingly abstract (Hogle, this volume; Schadt, 2012; Ioannidis, 2013). This abstraction and anticipation is built into the consent process, as the scope of access expands into an unknown future and moves beyond the borders of Sweden and the EU.
Building Infrastructure for Big Data and Big Value
Sweden’s long-standing collection of data about its population has traditionally been closely tied to the political logic of the universalist welfare state – what Evald referred to above as “the political system we have had where the state takes care of people.” In this welfare-state imaginary, science and the state work hand-in-hand to derive rational, data-driven policies for the public good. In contrast, LifeGene’s collection of information and samples has been legitimized through the logic of Big Data and Big Value, or the seemingly limitless social and scientific knowledge and economic growth Big Data promises to deliver. But what, exactly, does LifeGene offer, and for whom?
The anticipated social benefits of Swedish biobank-based research have been summarized as “increasing knowledge (to improve health) and decreasing health-care costs (via more efficient therapies)” (Nobel, 2008, p. 40). LifeGene’s areas of special focus include infectious disease, inflammation and allergy, cancer, metabolic and cardiovascular disorders, neuropsychiatric diseases, and musculoskeletal diseases. Researchers affiliated with LifeGene wrote: “we will be able to assess exposures and disease progression for a myriad of disorders that have considerable consequences not only for health care demand, but also for future development of chronic disease.” This expectation is linked to the hope that data mining will reveal “biomarker profiles for early detection or a basis for disease classification, prognosis, and treatment prediction” (Almqvist et al, 2011, pp. 75–76). These hopes for LifeGene’s potential contributions come at a time when efficiency and cost-effectiveness are among the top priorities in Swedish health policy. In 2009, health-care expenditures accounted for 9.9 per cent of the GDP, and Sweden’s population is now one of the world’s oldest (with 18 per cent of the population over the age of 65), suggesting future increases in health costs and increased strain on the welfare system (Anell et al, 2012). Thus LifeGene’s hopeful predictions about the future of healthcare seem to offer one way to counterbalance pessimistic assessments of the sustainability of Sweden’s welfare model in light of the demographic transition.
Yet, the challenge is that future health benefits of LifeGene’s data mining are expected rather than guaranteed, while concerns about data protection exist in the present. Furthermore, although rarely explicitly stated, LifeGene is directed not only toward tracing the etiology of diseases, but also at enabling the development of potentially lucrative interventions in individual health. LifeGene, along with the larger infrastructural push toward Big Data in Swedish biomedicine, was part of a national project initiated in 2009 intended to shore up Swedish research through strategic investment. Looking at this larger context reveals some of what is at stake for Sweden in its bid for Big Data.
Sweden’s efforts to improve national infrastructures for research and innovation are driven in part by anxieties about international competitiveness in the global economy. The project is described in the 2008 bill, “A boost for research and innovation,” which sets out that while Sweden has one of the highest rates of investment in R&D in the world (measured in relation to GNP), “for Sweden to be able to uphold its place in international competition” public investment must focus on research “at the highest international level” and “of significance for society’s development and industry’s competitiveness” (Regeringens Proposition, 2008, p. 57, my translation). The bill also notes that, “to a large extent, Sweden’s welfare depends on innovative and research-initiated industries such as pharmaceuticals, metal, forestry, and electronics” (Regeringens Proposition, 2008, p. 34, my translation). Sweden’s desire to expand its science-based and high-tech sectors can be traced back to the early 1990s, when a changing view of the Swedish growth model led to the identification of increased research expenditure as a key to economic expansion (Benner, 2003, p. 141). This political direction has been shared to a greater or lesser extent across the Nordic region; in Iceland, for example, initial support for deCODE and the Health Sector Database reflected both national fears about economic survival and the political embrace of science and technology as potential engines of growth (Winickoff, 2006, p. 88).
The SRC was given responsibility for national coordination of research infrastructures. The SRC identified “research involving personal registers” as a strategic investment area, describing the “substantial competitive advantages” that Sweden’s registers and biobanks offer for national biomedical research (SRC, 2012, p. 39). To expand on this perceived strength, the SRC recommended development of high-throughput biobanking capacity, large-scale computing platforms, high-speed data networks, resources for data storage and analysis, and expanding expertise in bioinformatics.6 These efforts would update national registers by making it possible to link them to the enormous volumes of omics data newly available.7
LifeGene, as a Big Data project, would not have been possible without the development of this infrastructure. Recent efforts to transform Swedish research infrastructure, however, also fit into a longer Western history of distributed research practices in biomedical research (Keating and Cambrosio, 2000). Yet, the SRC’s push to link existing registers with new forms of data was also informed by the allure of Big Data and the socially and economically valuable biomedical insights it could potentially reveal. As described in the SRC’s report, this restructuring would “[enable] Sweden to contribute to translational research by transferring the findings from basic research into developing new tools for early diagnostics, prevention, and individualized treatment” (SRC, 2012, p. 32), with the aim of making “changes in the organization of health services so that new knowledge is more easily accessible in clinical practice” and “improv[ing] the quality of life in humans and concurrently reduc[ing] society’s costs for health and health services” (SRC, 2012, p. 62).
Sweden’s data-driven reframing of biomedicine thus aims to find efficient and economical solutions to national problems of health and welfare by translating information into predictions and interventions. At the same time, LifeGene and its surrounding infrastructure are seen as offering a potential route to national economic growth for years to come through commercial partnerships and the potential development of diagnostic tools and pharmaceuticals.8 As LifeGene notes in its ethics policy, “the biotechnology and pharmaceutical industries can play an important role in realizing health benefits by developing and improving the use of biomedical products. Commercial companies and other research endeavors that stand to make a profit will, therefore, be allowed access to LifeGene” (LifeGene, 2010, p. 23). From this perspective, LifeGene offers Sweden both potential health benefits and projected economic returns: the national health-care system would become more effective and efficient, while the national economy would be bolstered by industrial applications.
In one sense, the project of managing populations and rationalizing everyday life is hardly new in Sweden, where the technocratic legacy of the social engineers is built into the structure of the welfare state. However, while benefits are imagined at a national level, the path to their realization requires the mobilization of population data without regard to national borders. Further, Sweden’s national data registers have traditionally been used for policy making and research with social applications, and thus are intimately connected with the welfare state. In contrast, LifeGene offers potential future benefits for the Swedish population, but as a Big Data project is necessarily oriented toward the general and the global.
Sweden’s infrastructural bid for Big Data draws on ideas of national characteristics – a scientifically motivated population, comprehensive databases, a universal health-care system – at the same time as it reaches beyond Sweden’s borders by standardizing protocols and technologies with an eye toward transnational collaboration and commercialization. If the dream of national data was the social engineer’s pragmatic, collectivist vision, the dream of Big Data draws on a font of optimism, optimization and ever-expanding ambition – to say nothing of capital investments throughout Northern Europe. In this sense, national pride in protection of Swedish data is already outmoded.
As Swedish research infrastructure is increasingly tailored toward enabling data-sharing by increasing compatibility with European and international research infrastructures, the national legal and ethical framework that governed the older population-based registers becomes less relevant to contemporary data use. Swedish biomedical data have already grown large enough to require national genomics and bioinformatics infrastructures that can provide computing clusters, memory and storage space beyond the capacity of Sweden’s largest universities (SRC, 2014). Work is now underway on a standardized informatics framework for biobanks in 23 European countries, including Sweden (BioBankCloud, 2012).
At the same time, the EU has been moving closer to instituting a new data protection directive to further standardize data legislation, coordinating consumer protection while allowing data to move more freely across EU borders. The stakes of EU data legislation are high; the European Commission has noted that, “according to some estimates the value of European citizens’ personal data has the potential to grow to nearly €1 trillion annually by 2020” (European Commission, 2014). In this context, decisions are being made in the present with the understanding that the value of Swedish data will soon skyrocket, with Bigger Data promising even Bigger Value. However, the focus on the potential of future knowledge and value highlights the need to resolve the ambiguities of governing Swedish data practices without offering indication of how this might happen. The construction of technological infrastructures for Swedish Big Data has progressed rapidly, and signs point to legal resolution in the near future, as Sweden negotiates the challenges of implementing the new European directive on data protection. As a Big Data project, LifeGene heralds a broader change in data collection, use and protection away from Sweden’s traditional state-managed and nationally bounded model. At the same time, LifeGene’s evocation of Sweden’s “scientifically motivated population” is indicative of how an older model of trust in science and the welfare state continues to be leveraged in support of the anticipated Big Value of Big Data in Sweden. Yet, as Swedish data legislation is restructured to accommodate the orientation of Big Data projects like LifeGene toward the general and the global, it is worth asking if the anticipated Big Value of such projects will accrue within the limits of national borders that can no longer contain the flow of citizens’ personal data.
Funders include the Swedish Research Council, the Karolinska Institute, AFA Insurance, and the Ragnar and Torsten Söderberg Foundations.
All names used are pseudonyms.
Sweden’s role in WWII was more complicated than popular narratives of neutrality might suggest. See Gilmour (2011) for a thoughtful analysis. However, elsewhere in Europe, the postwar legacy of atrocities enabled by unchecked state power and genocide committed in the name of social engineering would certainly lend a more ominous note to national efforts to build comprehensive population-wide registers.
Sweden’s 1998 Personal Data Act was based on the European “Data Protection Directive,” the European Parliament and Council Directive 95/46/EC from 24 October 1995. Sweden, as a member of the EU, is a member of the Data Protection Convention.
Biomedical infrastructures include the Swedish Initiative for Research on Microdata in the Social and Medical Sciences (SIMSAM), the Swedish National Data Service (SND), Biobanking and Molecular Resource Infrastructure of Sweden (BBMRI.se), Bioinformatic Infrastructure for Life Sciences (BILS), the National Genomics Infrastructure (NGI), and the Science for Life Laboratory (SciLifeLab), among others.
Sweden’s personal identification number system would also make it possible to link LifeGene with civil registries, although the informed consent specifies that only health-related data will be accessed. In Sweden, linking biobanks to civil databases such as police registers may be particularly sensitive. See Hansson and Björkman (2006).
LifeGene does not profit from research it enables, but individual researchers or corporate partners can commercialize products of their research, including any patentable inventions.
This work was supported by funding from the National Science Foundation (award number BCS-0921847), the Social Science Research Council, and the Fulbright Program. A fellowship at the Brocher Foundation in 2011 allowed me to begin preliminary work on this project. I would like to thank Rayna Rapp and Linda Hogle, the editors of this special issue, for their invaluable feedback on this article.
- Anell, A., Glenngard, A.H. and Merkur, S. (2012) Sweden: Health system review. Health Systems in Transition 14(5): 1–159.
- Austin, M.A., Harding, S. and McElroy, C. (2003) Genebanks: A comparison of eight proposed international genetic databases. Community Genetics 6(1): 37–45.
- BioBankCloud (2012) BioBankCloud: Your PaaS for biobanking, http://www.biobankcloud.eu, accessed 15 May 2014.
- Bygrave, L.A. (2010) Privacy and data protection in an international perspective. Scandinavian Studies in Law 56: 165–200.
- Cool, A. and Hoshor, A. (2012) Swedish consensus politics: Between technocracy and public participation. Paper presented at Society for Social Studies of Science Annual Meeting; 19 October, Copenhagen, Denmark.
- Dagens Nyheter (2012) Mål utan debatt om medel. 5 March.
- Dagens Nyheter (2013) Gränslös övervakning. 7 September, http://www.dn.se/ledare/huvudledare/granslos-overvakning/, accessed 2 June 2014.
- Daston, L.J. and Galison, P.L. (2007) Objectivity. New York: Zone Books.
- Datainspektionen (Data Inspection Authority) (2011) Tillsyn enligt personuppgiftslagen (1998:204), PuL. LifeGene. 16 Dec., Diarienr 766–2011, at http://www.datainspektionen.se/Documents/beslut/2011-12-19-lifegene.pdf, accessed 15 May 2014.
- Datainspektionen (Data Inspection Authority) (n.d.) About us, http://www.datainspektionen.se/in-english/about-us/, accessed 12 June 2014.
- Eaker, S. (2012) Biobanks in health care in Sweden. Biobank Sweden, October, Newsletter no. 7.
- Ehn, B., Frykman, J. and Löfgren, O. (1993) Försvenskningen av Sverige: Det nationellas förvandlingar. Stockholm, Sweden: Natur och Kultur.
- European Commission (2014) Progress on EU data protection reform now irreversible following European Parliament vote, 12 March, http://europa.eu/rapid/press-release_MEMO-14-186_en.htm, accessed 15 May 2014. European Commission MEMO/14/186.
- Fortun, M. (2008) Promising Genomics: Iceland and deCODE Genetics in a World of Speculation. Berkeley, CA: University of California Press.
- Görnerup, O., Gillblad, D., Holst, A. and Bjurling, B. (2012) Big Data analytics: A research and innovation agenda for Sweden. Stockholm, Sweden: Vinnova. http://www.vinnova.se/PageFiles/0/Big%20Data%20Analytics.pdf, accessed 8 March 2015.
- Hirdman, Y. (1989) Att lägga livet tillrätta: studier i svensk folkhemspolitik. Stockholm, Sweden: Carlsson.
- Johannisson, K. (2012) Rebecca Skloot: “Den odödliga Henrietta Lacks.” Dagens Nyheter 23 January.
- Jonsson, J.L. (2014) Våra digitala fotspår kan ge smartare städer. Dagens Nyheter 16 February, pp. 20–21.
- Kielos, K. (2013) En öppen värld – för genierna. Aftonbladet, 23 June, http://www.aftonbladet.se/ledare/ledarkronika/katrinekielos/article17006733.ab, accessed 12 May 2014.
- Lagerwall, K. (2013) Använd inte sociala medier för seriösa saker. Dagens Nyheter 12 June, http://www.dn.se/ekonomi/anvand-inte-sociala-medier-for-seriosa-saker/, accessed 15 June 2014.
- LifeGene (2009) LifeGene ethics policy: Version 3.2, February, http://lifegene.ki.se/ethical_issues/documents/090331LifeGeneEthicsPolicyv32.pdf, accessed 15 May 2014.
- LifeGene (2010) LifeGene: A unique initiative for health. August, https://www.lifegene.se/PageFiles/102/general_lg_presentation_aug_2010.pdf, accessed 15 May 2014.
- LifeGene (n.d.) “LifeGene Rationale,” https://www.lifegene.se/PageFiles/102/20090206_lifeGene_Rationale.pdf, accessed 15 May 2014.
- Mayer-Schönberger, V. and Cukier, K. (2013) Big Data: A Revolution that Will Transform How We Live, Work, and Think. Boston: Houghton Mifflin Harcourt.
- Myrdal, A. and Myrdal, G. (1934) Kris i befolkningsfrågan. Stockholm, Sweden: Bonnier.
- Nilsson, A.S. (1991) Den moraliska stormakten: en studie av socialdemokratins internationella aktivism. Stockholm, Sweden: Timbro.
- Nilsson, J. (2012) Stoppat super-register mot cancer ska räddas. Dagens Nyheter, 28 Feb. Available at http://www.dn.se/nyheter/politik/stoppat-super-register-mot-cancer-ska-raddas/, accessed 22 June 2015.
- Nobel, S. (2008) Biobanks – Integration of Human Information to Improve Health. Stockholm, Sweden: Committee for Research Infrastructures and the Scientific Council for Medicine at the Swedish Research Council. Swedish Research Council Report Series no.10. Available at http://vr.se/download/18.235f40c212384f2ca668000177/1340207439890/Biobanks_2008_10.pdf, accessed 25 July 2014.
- Öman, S. (2010) Trends in data protection law. Scandinavian Studies in Law 56: 210–205.
- Qwerin, G. (1987) Metropolit i massmedia. Sweden: National Council for Crime Prevention, BRA Forskning, 4.
- Rebas, K. (2013) Det är du som är produkten. Dagens Nyheter 9 August, http://www.dn.se/ledare/kolumner/det-ar-du-som-ar-produkten/, accessed 14 June 2014.
- Regeringens Proposition (2008) Ett lyft för forskning och innovation (A boost for research and innovation). Stockholm, Sweden, 20 October. Prop. 2008/09:50.
- Regeringens Proposition (2013) Vissa register för forskning om vad arv och miljö betyder för människors hälsa (Some registers for research on what nature and nurture mean for human health). 23 May. Prop. 2012/13:163.
- Rothstein, B. (2006) Vad bör staten göra?: om välfärdsstatens moraliska och politiska logik. Stockholm, Sweden: SNS Förlag.
- Sejersted, F. (2011) The Age of Social Democracy: Norway and Sweden in the Twentieth Century. Princeton, NJ: Princeton University Press.
- Shapin, S., Schaffer, S. and Hobbes, T. (1985) Leviathan and the Air-pump: Hobbes, Boyle, and the Experimental Life. Princeton, NJ: Princeton University Press.
- Statens Offentliga Utredningar (SOU) (2014) Unik kunskap genom registerforskning, 25 June, Stockholm, Sweden: Utbildningsdepartementet, SOU 2014:45.
- Swedish Research Council (SRC) (2012) The Swedish Research Council’s guide to infrastructures 2012: Recommendations on long-term research infrastructures by the research councils and Vinnova. Stockholm, Sweden: Swedish Research Council. Vetenskapsrådets Rapportserie, 3.
- Swedish Research Council (SRC) (2014) Swedish Science Cases for E-Infrastructure, 15 March, Stockholm, Sweden: Swedish Research Council.
- Svensk författningssamling (SFS) (2013) Lag (2013:794) om vissa register för forskning om vad arv och miljö betyder för människors hälsa, 24 October, Stockholm, Sweden: Utbildningsdepartementet.
- Traweek, S. (1992) Beamtimes and Lifetimes: The World of High Energy Physicists. Cambridge, MA: Harvard University Press.
- Utbildningsdepartamentet (2012) Nya beslut om medicinsk forskning (pressmeddelande). Stockholm, Sweden: Department of Education, 28 February.
- Waldby, C. (2002) Stem cells, tissue cultures and the production of biovalue. Health 6(3): 305.
- Watson, I. (2010) A short history of national identification numbering in Iceland. Bifröst Journal of Social Science 4: 51–89.