DNA Databases and Big Data

Criminal DNA databases are expanding in different regions of the world to support the activities of the criminal justice system. The use of techniques that combine different sources of digital information for preventing and anticipating the risk of crime (one of the potential uses of so-called Big Data) is increasingly seen as a promising strategy to govern crime. This chapter provides an overview of the development of technological systems orientated towards genetic surveillance of criminalized populations. It also outlines a comprehensive mapping of the main ethical, social and political challenges related to the growing uses of DNA databases and Big Data at a global scale.

These databases may contain genetic profiles of convicted persons, suspects, victims, volunteers and other persons of interest, in order to conduct criminal investigations.
A database provides a matrix of genetic profiles based on biological samples collected from a set of individuals. In the context of a pending criminal investigation, traces found at the crime scene or on the victim's body may be analysed, and the resulting DNA profiles are compared with those held in the forensic genetic database. In the event of a positive match, this makes it possible to identify the source of the trace.
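The comparison step described above can be illustrated with a minimal sketch. It assumes a profile is a mapping from STR locus names to unordered allele pairs and that a "match" means identical alleles at every locus typed in both profiles. The function names, the `min_loci` threshold, the person identifiers and all allele values are hypothetical and chosen only for illustration; operational databases use more loci, handle partial and mixed profiles, and attach a statistical weight to each match.

```python
from typing import Dict, FrozenSet, List

# A profile maps an STR locus name to the unordered pair of alleles seen there.
Profile = Dict[str, FrozenSet[str]]


def alleles(a: str, b: str) -> FrozenSet[str]:
    """Build an unordered allele pair (a homozygous pair collapses to one value)."""
    return frozenset({a, b})


def shared_loci_match(trace: Profile, reference: Profile, min_loci: int = 3) -> bool:
    """True if at least min_loci loci are typed in both profiles and all of them agree."""
    common = set(trace) & set(reference)
    if len(common) < min_loci:
        return False
    return all(trace[locus] == reference[locus] for locus in common)


def search_database(trace: Profile, database: Dict[str, Profile], min_loci: int = 3) -> List[str]:
    """Return the identifiers of all stored profiles that match the trace profile."""
    return sorted(
        person_id
        for person_id, reference in database.items()
        if shared_loci_match(trace, reference, min_loci)
    )


# Hypothetical reference profiles (invented allele values).
database: Dict[str, Profile] = {
    "P-001": {
        "D3S1358": alleles("15", "16"),
        "vWA": alleles("17", "18"),
        "FGA": alleles("21", "24"),
        "TH01": alleles("6", "9.3"),
    },
    "P-002": {
        "D3S1358": alleles("14", "15"),
        "vWA": alleles("16", "16"),
        "FGA": alleles("22", "23"),
        "TH01": alleles("7", "9"),
    },
}

# Hypothetical crime-scene trace, typed at three loci only.
trace: Profile = {
    "D3S1358": alleles("16", "15"),
    "vWA": alleles("17", "18"),
    "FGA": alleles("21", "24"),
}

hits = search_database(trace, database)
print(hits)  # ['P-001']
```

Raising `min_loci` makes the search stricter: with `min_loci=4` the three-locus trace above returns no hits, which shows how the matching rule itself shapes who is flagged.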
The process of creating forensic databases with genetic profiles began in the mid-1990s. The first forensic genetic database was set up in England and Wales in 1995, and countries such as the Netherlands (1997), Austria (1997) and Germany (1998) followed suit. It is estimated that there are now 69 countries around the world operating this type of database and that at least 34 countries are starting the process of implementing their own database (Interpol, 2016; Prainsack & Aronson, 2015). Such databases exist in different regions of the world, especially in Europe and North America; however, recent developments point to growing expansion in Asia, in particular in China, India and South Korea (Forensic Genetics Policy Initiative, 2017).
The creation of databases to support criminal investigation is aligned with the social, economic and political context of the so-called information society, which many authors consider to be a society of maximum surveillance that began to emerge in the mid-1980s (Boersma, Van Brakel, Fonio, & Wagenaar, 2014; Garland, 2001; Lyon, 1992, 2006; Marx, 2002; Norris & Armstrong, 1999). The phenomenon of Big Data emerges in the context of technological development and the growing importance of the digital world, which is associated with large-scale collection of citizens' data. It can be defined as a technique that aggregates and analyses a massive amount of data, processing it algorithmically so that it is numerically categorized and indexed and information can then be extracted from it. The technique can be applied in several spheres of social life, including commerce, consumption, health, social security, marketing and immigration. In the context of this book, the authors will pay careful attention to expectations associated with the potential of applying Big Data in the fields of criminal investigation and security (Brayne, 2017; Chan & Moses, 2015; Tsianos & Kuster, 2016). The complexities and challenges arising from the use of forensic DNA databases and Big Data in the context of criminal investigation will be presented and briefly discussed in the different sections of this chapter.

Ethical Questions Associated with the Use of Forensic DNA Databases
It is now widely recognized that forensic genetic databases can be beneficial for criminal investigation activities and the production of evidence in the justice system and may eventually contribute to crime prevention and deterrence (Santos et al., 2013; Walsh, Buckleton, Ribaux, Roux, & Raymond, 2008). However, the use of such databases raises diverse and complex ethical, social and political questions which, from our perspective, must be considered with the suitable involvement of various social actors: legislators, judicial operators, forensic experts and politicians (Machado & Silva, 2015a, 2015b; Wienroth, Morling, & Williams, 2014). Commentators from different professional fields and scientific disciplines have pointed out that the use of forensic genetic databases must take account of ethical concerns and respect fundamental human rights, such as freedom, autonomy, privacy, presumption of innocence and equality (Amankwaa & McCartney, 2018; Krimsky & Simoncelli, 2011; Van Camp & Dierickx, 2007). The most controversial ethical issues related to DNA databases for criminal investigation concern, above all, the criteria for selecting the DNA profiles to be included and for the collection, conservation, use and circulation of data. These issues are made more pressing by the clear trend towards the growing use of such databases. There are other aspects that may raise ethical issues, which are listed below (Hindmarsh & Prainsack, 2010; Prainsack & Aronson, 2015).
The myth of the infallibility of DNA profiling may lead to overlooking possible laboratory errors and other errors and result in the marginalization or even elimination of other types of evidence in court. Identification errors can have profound and irremediable implications, so guaranteeing quality in all technical procedures is also an ethical issue.
Kinship ties can be established through familial searching (see Chap. 7), even though this information may be unknown to the person who is registered, or its disclosure may constitute a breach of the individual's private life and moral integrity.
Forensic DNA databases reproduce and reinforce social inequalities. Members of specific minorities are more likely to be included in forensic DNA databases and then, consequently, placed under greater surveillance (Chow-White & Duster, 2011; Skinner, 2012, 2013, 2018). The seminal work of Robin Williams and Paul Johnson (2004) is essential in this respect for its analysis of the unique nature of the surveillance based on DNA data and its implications for the construction of suspicion. The authors argue that DNA databases allow for "reconstructive surveillance", forming a circuit surveillance system which holds information that can be applied retrospectively, meaning that people and their actions are not watched, but are inferentially reconstructed using expert practices (Williams & Johnson, 2004, pp. 3-6). As the authors note, "DNA databases have a speed, efficiency, automation, and accuracy that are unmatched in the history of policing" (Williams & Johnson, 2004, p. 8). Moreover, Williams and Johnson explain that DNA databases form "a type of surveillance which is essentially concerned with 'management' of those already deemed criminal […] delimiting them from the wider population and managing them through assured detection" (Williams & Johnson, 2004, p. 11).
Finally, it is important to note that there are high costs associated with creating and maintaining a DNA database, and there are no studies that provide consistent evidence of its efficacy, utility and deterrence effect (Toom, Granja, & Ludwig, 2019). Do the benefits of this technology justify this investment? In other words, can it be argued that these resources would be better applied to crime prevention policies, the social reintegration of offenders and/or ways to reinforce protection for the most vulnerable segments of society?
In 1997, Deryck Beyleveld, a specialist in Law and Ethics, proposed a systematization of two positions on weighing the risks and benefits associated with the use of forensic DNA databases: what he called the enthusiastic model ("camp enthusiastic") and the pessimistic model ("camp hostile"). These general models are abstract constructions that selectively accentuate certain aspects of concrete reality (Beyleveld, 1997).
The enthusiastic model of the use of DNA databases within criminal justice appears to be based on a model of criminal justice that focuses on identifying and punishing offenders and deterring crime. It is accepted that, in principle, all individuals may be guilty, and that one of the purposes of the justice system is to find out who actually committed crimes and then punish them. In relation to the normative question of the relationship between the collective good and the individual good, this position is guided by affirmation of the relative superiority of the interests of the community, considering that upholding people's safety and combatting crime are common goods that justify placing constraints on individual rights. From this perspective, emphasis is placed on greater effectiveness in identifying guilty persons and valuing a society with more effective structures in controlling individuals and ensuring safety (Beyleveld, 1997).
The pessimistic model emphasizes the potential risks and drawbacks of using DNA databases in the criminal justice system and understands that the primary purpose of the justice system is to uncover the truth and protect the rights of innocent people. It is accepted that, in principle, defendants are presumed to be innocent until proven guilty. Special attention should therefore be paid to procedures that protect defendants against the possibility of error and ensure equal access to evidence for both defence and prosecution. This position broadens the reflection on the possible harmful consequences for democracy of a society that treats people's safety as a supreme good, and it holds that extending the criteria for inclusion of information in a DNA database may prove to be inadequate and disproportionate to the potential benefits (Beyleveld, 1997).
It should be noted that it is difficult to find empirical evidence of either extreme position, whether in legislative, political or expert terms or in the everyday assumptions of ordinary citizens (Machado & Silva, 2015b; Williams & Johnson, 2004). It is easier to find compromises, which relate to the need to strike a balance between safeguarding people's safety and combatting crime, while upholding citizens' rights, freedoms and guarantees (Amankwaa, 2018

The Panorama of Forensic Genetic Databases in European Countries
The size of forensic genetic databases and their type of organization and regulation are highly varied. Legislation may state the possible purposes or uses of DNA databases, distinguishing between criminal identification, civil identification and scientific research purposes. It can also establish the scope and means of access to the information held in the database: for example, whether all authorities (judicial authorities or police forces) have access or whether access is restricted to certain agents of the justice system, and whether only information about matches between genetic profiles may be communicated or whether other information can also be communicated (e.g., personal data relating to the person identified by means of the genetic profile).
Other issues that are usually determined in national legislation are those related to the criteria for insertion and removal of genetic profiles and biological samples. Different options exist in the legislation of different countries that determine the scope and extent of access to the DNA database based on criteria such as the type of crime committed, the maximum duration of the potential sentence, the individual's characteristics and the likelihood of recurrence. As a result, the law is expected to respond to the following questions: whose profiles shall be inserted into the DNA database, and under what circumstances? What is the fate of biological samples collected from suspects or convicts? What are the deadlines for retention of DNA profiles and samples?
In general terms, the criteria governing the insertion and removal of profiles and samples constitute the variable that will have the most significant impact on the size of databases of genetic profiles. According to Filipe Santos and colleagues, who carried out a study on legislative trends in DNA databases in Europe, there are countries with expansive legislation and others with restrictive legislation (Santos et al., 2013). According to this typology, the countries with restrictive legislation are Germany, Belgium, Spain, France, the Netherlands, Hungary, Ireland, Italy, Luxembourg, Portugal and Sweden, whereas the countries with expansive legislation are Austria, Denmark, Scotland, Slovakia, Estonia, Finland, Latvia, Lithuania and the UK (England, Wales).
According to the authors, if a specific law places few constraints on the insertion of profiles of suspects or convicted persons into the DNA database for forensic purposes (e.g., it allows the inclusion of the DNA profile of any individual suspected of any punishable offence), the country may be designated as having an expansionist tendency with respect to the development of such databases. By contrast, countries with a restrictive tendency are those whose legislation currently contains various constraints that restrict and limit the uses of DNA databases, for example the imposition of limits on the types of sentences or crimes eligible for the insertion of profiles.
It should be noted that the apparent dichotomy between the expansionist and restrictive tendencies refers to the potential specific effects of legislative provisions. These effects are reflected, for example, in the proportion of the national population present in the database of each country. Table 5.1 shows the size of several forensic genetic databases in Europe. It should be pointed out that one country with "restrictive" legislation has nevertheless seen remarkable expansion over recent years and now holds the third largest forensic genetic database in Europe. The database of genetic profiles in England and Wales remains the largest of all, notwithstanding recent legislative changes in the wake of the decision of the European Court of Human Rights (ECHR) following S. & Marper v. UK 1 (McCartney, Williams, & Wilson, 2010), which ordered the destruction of biological samples and the elimination of DNA profiles of acquitted suspects or persons who have not been accused of any crime (Amankwaa & McCartney, 2019).
Despite legislative differences in European DNA databases, the dominant trend towards their generalization and more harmonized sharing of information has been increasingly encouraged, based on the common threat of cross-border crime and terrorism. The implementation of the Prüm Decisions (EU Council, 2008a, 2008b), in particular the parts related to sharing of information from DNA databases, may lead to the need for further legislative harmonization in the various EU countries, a topic that will be further explored in Chap. 6. Given the diversity of the criteria for insertion and removal of DNA profiles and preservation of samples, it is difficult to ensure compliance with the principles of equality, proportionality and presumption of innocence in the context of the transfer of information about DNA profiles between the Member States. One example is the apparent insufficiency of a policy of standardization and monitoring of processes related to cooperation activities, and also to the collection, retention, processing, interpretation and legal application of information on DNA profiles, within the framework of the planned measures (Amankwaa, 2019; McCartney, Wilson, & Williams, 2011; Santos & Machado, 2017; Toom, 2018).
1 S. & Marper v. UK refers to a complaint lodged with the European Court of Human Rights by two individuals (S, an 11-year-old child, and Marper) against the UK. Both S. and Marper were detained in unrelated circumstances in 2001, and their fingerprints and DNA samples were collected. No accusations resulted from the arrests, which led them to ask the Chief Constable to eliminate the records. The requests were denied. After appeals against the Chief Constable's decision to the courts and the House of Lords, it was determined that although the individuals had not been charged with any crime, and despite the possible breach of privacy, the retention of fingerprints and DNA profiles was considered to be beneficial to society (McCartney et al., 2010). The ECHR's decision went the other way, and determined that the retention of fingerprints and DNA profiles of suspects who have not been convicted constitutes a "disproportionate interference" with individuals' rights to privacy and "cannot be taken for granted in a democratic society" (Council of Europe, 2008, par. 125).

Big Data in the Criminal Investigation
The topic of Big Data has gained increasing visibility in the public arena and academic studies. It is generally understood to be a phenomenon which, using digital technology, collects, stores and analyses data from various sources for certain specific purposes. A popular shorthand defines the essence of Big Data by three "V's": volume, velocity and variety. Other characteristics can also be listed: Big Data refers to data sets with a high level of completeness (e.g., covering entire populations) that contain contextual information that can identify concrete and specific situations (e.g., instead of identifying groups or types of people, it makes it possible to identify specific persons). Also, such data sets are relational (i.e., they make it possible to compare data derived from different sources) and flexible (they can incorporate new data at any moment) (Chan & Moses, 2015; Kitchin, 2014a, 2014b). From a sociological perspective it is crucial to address Big Data as a cultural, social and political phenomenon (Boyd & Crawford, 2012), which encompasses the following dimensions, as defined by Janet Chan and Lyria Bennett Moses:
(1) Technology: maximizing computation power and algorithmic accuracy to gather, analyze, link and compare large data sets.
(2) Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical and legal claims.
(3) Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity and accuracy. (Chan & Moses, 2015, p. 24)
The "mythological" aspect associated with Big Data finds similarities in the social imaginaries associated with forensic genetics, and is also liable to generate expectations of producing irrefutable truths in the identification of perpetrators (Lynch, Cole, McNally, & Jordan, 2008).
This type of social expectation concerning Big Data opens the doors to expansion and reinforcement of surveillance practices, which will henceforth take on specific new contours while reproducing "old" practices.
A central aspect of Big Data's implications for criminal investigation concerns the predictive and anticipatory nature of risk. This aspect of Big Data reinforces a trend that is already seen in the creation and expansion of forensic genetic databases, as described in earlier sections of this chapter. Big Data therefore emerges as a reinforcement of the trends towards foreseeing and anticipating risk: through massive quantification and new possibilities for rapid cross-checking of data from sources that until recently have been dispersed, such as the proliferation of automatic alert systems which, on an unprecedented scale, monitor people who have never had any contact with the criminal justice system (Brayne, 2017).
In the framework of criminal investigation, Big Data can, therefore, act as a means of generating intelligence for criminal investigation, making it possible to quantify assessment of risk and classify individuals according to their degree of risk. For example, Big Data techniques can serve to determine the risk that specific individuals will commit a crime or terrorist act (Ball, Di Domenico, & Nunan, 2016;Lyon, 2014). Quantification of the level of risk presented by certain individuals means that Big Data reinforces the surveillance of social groups and individuals who are more vulnerable to police suspicion, thereby consolidating social mechanisms of stigmatization and reproduction of social inequalities (Brayne, 2017;Kitchin, 2014b;Matzner, 2016;Raley, 2013).

Concluding Remarks
In the context of this chapter, DNA databases and Big Data techniques are both seen as processes through which new and effective social control modalities have been configured. Such processes are associated with political and governmental strategies for crime prevention and control, in societies that are increasingly less tolerant of "suspicious" citizens and willing to adopt more intensive regimes of social control, regulation and inspection. The analysis in this chapter is underpinned by the understanding of the concept of surveillance as the streamlined control of information in modern organizations intertwined with capitalist production and consumption systems and with the bureaucratic functioning of the State (Haggerty & Ericson, 2000; Lyon, 2004, 2014).

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.