One of the consequences of excluding many of the collaborative genome efforts that proliferated in the 1980s and 1990s from the success story of genomics has been the assumption that human genomics corresponded to a single initiative or entity. This assumption portrays the Human Genome Project as one international endeavour that started and ended at defined dates, presented a set of stable participants, and operated according to a predefined plan: the large-scale production of a reference sequence of the whole human genome. The narrative of a single human genome effort consolidated in June 2000, when a consortium of funders, sequencing centres and bioinformatics institutions from Europe, Asia and North America presented a first draft of the full sequence of Homo sapiens in a ceremony chaired by the US president, Bill Clinton, and attended remotely by the UK Prime Minister, Tony Blair (Chap. 4). Before, and contemporaneous with, this announcement, a number of multinational genome initiatives to sequence yeast (Saccharomyces cerevisiae), the fruit fly (Drosophila melanogaster) and the thale cress (Arabidopsis thaliana) were unfolding with substantial leadership from the European Commission (Chap. 2).

The draft human genome sequence was published in the journal Nature in 2001. This article referred to the sequencing effort as the “Human Genome Project” and defined this project as an “international collaboration” that had started in 1990 and was scheduled to conclude with the release of a more final sequence, which appeared in a follow-up publication, also in Nature, in 2004 (International Human Genome Sequencing Consortium, 2001, pp. 860 and 862; 2004). Since then, press coverage, popular literature and a substantial amount of academic scholarship have depicted a single, international Human Genome Project.Footnote 1 The depiction of the role of the European Commission (EC) as a funder and broker of genomic endeavours has tended to be restricted to yeast sequencing and presented as an antecedent to the Human Genome Project. As we discussed earlier in the book, this consideration of S. cerevisiae as a pilot or model platform for human sequencing aligns more with the US yeast genome effort than with the EC one. The EC, rather, selected yeast as an industrially-significant organism that would foster economic growth and scientific collaboration across its member-states (Chap. 2; see also Parolini, 2018).

In this chapter, we continue augmenting the historical landscape of genomics and de-centring it beyond the production of a human reference sequence. We start by arguing that instead of a monolithic Human Genome Project (with capitals H, G and P), a plethora of national and international human genome initiatives co-existed from the mid-1980s onwards with different rationales, spokespersons and funding regimes.Footnote 2 As late as 1996, the strategy of tackling the whole human genome via the concerted action of a handful of large-scale sequencing centres was not yet dominant. Contemporary historical accounts (e.g., Cook-Deegan, 1994, Part Three) document that only one national initiative unambiguously sought, from the outset, to produce a physical map and a reference sequence of the entire human genome: the joint programme of the USA’s Department of Energy (DoE) and National Institutes of Health (NIH). Due to this, a widely accepted meaning of the capitalised phrase ‘Human Genome Project’ during most of the 1990s was just the US national effort, which itself adopted that name.

We designate the US national programme throughout this chapter as ‘US-HGP’. It formally commenced in 1990, when some other national human genome projects were already underway, and had as a defining characteristic the concentration of NIH and DoE funding in a series of centres that specialised in various aspects of genomics, such as physical mapping, large-scale sequencing, bioinformatics or technology development (Hilgartner, 2017, Ch. 2 and pp. 91-110). Some of these centres already existed and were devoted to other types of research, such as medical genetics or, in the case of those supported by the DoE, the effects of radiation on DNA. Others were created de novo to comprehensively sequence the human genome and those of pilot organisms, such as yeast (Chap. 2). All of the centres were committed to the objective of full genome mapping and sequencing, a feature that distinguished the US-HGP from many other contemporary human genome programmes. As we highlight, a leading architect in the design of the new centres and advocate of their whole-genome approach was the Nobel Prize-winning molecular biologist—and co-discoverer of the double helical structure of DNA—James Watson, who led the NIH arm of the US-HGP until 1992.

Among national human genome programmes, the objective of mapping and sequencing the whole human genome was unique to the US-HGP, as was the prominence of a leader such as Watson. The main insights of this chapter stem from comparing the US programme with another, less well-known national initiative: the UK Human Genome Mapping Project (HGMP). Launched one year earlier, in 1989, and funded by the British Government through its Medical Research Council (MRC), the HGMP did not create large-scale sequencing centres. As with many other emergent human genome projects, its strategy aligned with the distributed, network approach that the EC was forging for the sequencing of yeast (Chap. 2).

The HGMP enabled the MRC to secure funds from the UK Treasury for a Directed Programme of grants specifically tailored to map and sequence human DNA. The recipients of those grants were laboratories in the fields of human genetics and, especially, medical genetics. Those recipients and the ways they aimed to tackle the human genome were key differences between the British programme and the US-HGP. Rather than promoting a new breed of whole-genome-oriented practitioners, as the DoE and the NIH were fostering, the HGMP funded and coordinated research groups that kept working on specific parts of the human genome. In other words, the communities of genomicists that constituted each programme differed. Although the beneficiaries of HGMP grants collectively produced map and sequence data across the genome, they retained their individual identity as specialists in diseases or biological phenomena affecting only certain genome regions. Conversely, the specialism of the DoE and NIH-funded genomicists increasingly tended towards the large-scale mapping and sequencing of the entire human genome.

The HGMP beneficiaries used their grants to develop mapping and sequencing methods aimed at positioning, within human chromosomes, DNA fragments encompassing genes or gene markers associated with diseases, or any other biologically or medically relevant characteristic. They were assisted by a resource centre that the HGMP established as both a technological hub and a repository of the genomic data produced by the laboratories in receipt of Directed Programme funding (Balmer, 1998; Glasner, 1996). Apart from providing technical support and advice to the HGMP-funded laboratories, the resource centre pooled their mapping results and compiled them in databases.Footnote 3 It also conducted partial sequencing of the mapped DNA fragments, particularly the regions corresponding to genes that were thought to be involved in the genetic diseases that the HGMP laboratories investigated. This work was developed in collaboration with gene-specific sequencing groups sponsored by the Human Genome Analysis Programme (HGAP), an initiative that the EC launched in 1990 and that followed the distributed model it had just implemented for the yeast sequencing project (Chap. 2).

The HGMP Resource Centre differed from the US-HGP genome centres in two key aspects: (1) it fulfilled a service role and conducted mapping and sequencing work at the request—and based on the results—of the Directed Programme-funded laboratories rather than comprehensively sequencing at its own initiative; and (2) the map and sequence data it compiled represented only the areas of interest of the contributing laboratories and was thus not intended to be a complete representation of the whole human genome. As we argue, out of the HGMP, HGAP and other groupings of human and medical geneticists—such as the chromosome mapping workshops—a community of genomicists emerged, one that was larger and more diverse than the one working at the genome centres conducting the US-HGP.

This contrast enables us to conclude that a key factor distinguishing the US-HGP from the HGMP, and more generally from the distributed approach promoted by the EC, lay in the assemblage of the research communities and funding regimes underlying each of them. In the case of the US-HGP, this assemblage embodied Watson’s vision, his circle of influence and the joint funding provision of the DoE and NIH. Watson was a founder of molecular biology and, from the late-1960s onwards, had been instrumental in structuring this community from his position as director of Cold Spring Harbor Laboratory (CSHL). One of the pillars in this structuring process had been fostering the shared belief in the mechanistic action of genes and the community’s commitment to detailed investigations of model organisms. It was hoped that a full molecular description of those organisms would unveil the role of genes in a myriad of biological processes.Footnote 4

As we showed in the previous chapter, the worm Caenorhabditis elegans had become one such model organism. It was at CSHL where, in 1989, Watson met John Sulston, Alan Coulson and Robert Waterston, and instigated the start of the worm’s sequencing project. The designated host institution of the project in the USA, Washington University, subsequently inaugurated a genome centre and undertook the sequencing of two other organisms: yeast and H. sapiens. This intensive and scaled-up approach differentiated the genome centres from the distributed model that the EC was promoting in its sequencing programmes (Chap. 2). The status of C. elegans and yeast as model organisms was one of the reasons that led Watson to regard them as suitable pilot platforms to inform the mapping and sequencing of the human genome. He did not hesitate in adopting the same genome centre model when the US-HGP—a self-contained, national initiative as opposed to the multi-country programmes of the EC—sponsored the comprehensive human genome sequencing effort. Under Watson’s leadership, the US-HGP was the vehicle for producing a reference sequence from which the connections between genes and biological properties—implicating evolution, health and disease—could later be drawn.

The HGMP was also promoted by a founding figure of molecular biology: the proponent of C. elegans as a model organism, Sydney Brenner. Yet the MRC, partly due to the size of the UK relative to the USA, lacked the resources to launch a whole-genome initiative on its own. This led Brenner and the MRC to look at the communities of human and medical geneticists as possible allies to execute the project. Unlike molecular biologists, these communities were interested in variation rather than comprehensive standard descriptions as an entry point into investigating gene function. Consequently, their motivation to tackle the human genome was not achieving a complete reference sequence, but using the reference sequence data as a scaffold to aid in the determination of variants associated with diseases or evolutionary traits. Identifying and investigating variation, as opposed to establishing a canonical reference sequence, was thus a driving force behind the organisation of the HGMP and its indifference to adopting whole-genome approaches. From the viewpoint of the many laboratories supported by the HGMP Directed Programme, focusing on specific genome regions that could be compared either with those of other organisms or between patients suffering from a genetic condition and non-sufferers was far more useful than mapping and sequencing the entire human genome.Footnote 5

In what follows, we show that the differences in the funding systems, organisational models, communities and genomicists involved in the HGMP and US-HGP assemblages point to a diverse landscape. This diversity is difficult to grasp from a perspective that narrowly focuses historical inquiry on the human reference sequence published in 2001 and 2004. What is now associated with a single, coherent and successful Human Genome Project represents just one route through complex historical terrain. Our ability to identify the web of pathways that criss-cross this terrain enables us to extend our historical interrogation from yeast to H. sapiens. The multiplicity of both parallel and interwoven lineages in the development of the HGMP and US-HGP indicates that the historical landscape was as heterogeneous in human as in non-human genomics. By looking both within and beyond human genomics, we can highlight the factors that led to the increasing prominence of the human reference genome. This enables us to assess its significance in a fresh light, while at the same time preventing it from narrowing our vision.

1 The Exception Rather Than the Rule

In 1988, Watson supplemented his CSHL directorship with a new role as associate director of the freshly-established NIH Office for Human Genome Research. He had held the CSHL position since 1968—15 years after co-elucidating the DNA double helix and 6 years after receiving the Nobel Prize—and transformed this institution into the most influential forum of molecular biology. CSHL held annual symposia in which the invitees, considered to be the international elite of molecular biologists, would discuss pressing scientific challenges. The 1986 symposium had been devoted to the Molecular Biology of Homo sapiens and became one of the first settings in which the feasibility of mapping and sequencing the human genome was assessed. The enormous size of the human genome—three billion DNA nucleotides compared to the 12 million of yeast and 100 million of C. elegans—made the viability and utility of the enterprise a matter of debate within and outside the CSHL meeting. In his 1988 CSHL director’s report, Watson expressed concerns about his increased responsibilities and the stress of commuting. He considered, however, that the remit of the new NIH Office—implementing a national human genome programme in the USA—represented a one-time “opportunity”. From this new position, Watson could let his scientific life “encompass a path from double helix to the three billion steps of the human genome”.Footnote 6

Watson’s commitment to the sequencing of the human genome was shared by scientists and administrators at the DoE. However, rather than completing the molecular description of DNA (from ascertaining the double helix to laying out its nucleotide sequence), what the DoE human genome advocates sought was to build on a longstanding tradition of investigating the genetic effects of radiation. This line of research had started after World War II, following the dropping of the atomic bombs and their devastating medical effects on local populations in Hiroshima and Nagasaki (Lenoir & Hays, 2000; Lindee, 1994). It had led to the reorientation of some of the personnel and research programmes of DoE-funded laboratories from physics to the life sciences. An example of this was Los Alamos National Laboratory, which, after playing a leading role in the wartime race to develop the atomic bomb (it was the home of the flagship Manhattan Project), devoted a growing proportion of its mathematics expertise and computing resources to solving biological and medical problems.

Due to this, Los Alamos was chosen as the institution that would host the first centralised DNA sequence database in the USA—GenBank—in 1982 (Strasser, 2019, Ch. 5). A few months prior to the 1986 symposium at CSHL, DoE representatives organised a workshop in Santa Fe and subsequently announced a pioneering programme called the Human Genome Initiative.Footnote 7 As a result of this, the biomedical lines of research at two other DoE-sponsored institutions, the Lawrence Berkeley and the Lawrence Livermore National Laboratories, were strengthened and largely channelled towards technology development and genome-wide mapping and sequencing of human DNA. That the DoE network of national laboratories was equipped with personnel and infrastructures to conduct big science endeavours was a competitive advantage that favoured their early leadership in the incipient human genome work in the USA.Footnote 8

How the DoE initiative converged with the NIH effort has been amply described in the literature (Cook-Deegan, 1994, Part Three; Hilgartner, 2017, pp. 91-110). In 1988, two reports issued by the US National Academy of Sciences and the Office of Technology Assessment recommended a single national initiative that would initially focus on physical mapping and improving the existing instrumentation to create a platform for sequencing the human genome in the longer term. This led the DoE and NIH to merge their endeavours into the US-HGP, a 15-year programme that was launched in 1990 with a three billion dollar budget to which both agencies contributed. The DoE did so through the Office of Health and Environmental Research and the NIH through the National Center for Human Genome Research, an expanded version of Watson’s Office that was later renamed the National Human Genome Research Institute (NHGRI).Footnote 9 The explicit goal of the US-HGP was to produce a physical map and reference sequence of the whole human genome by 2005.

What we want to stress concerning the history of the US-HGP is how this initiative, and other contemporary human genome projects, disrupted the funding and organisational regimes of biomedicine. This disruptive effect has already been noted by scholars who have investigated the impact of big science and data-intensive approaches on different areas of contemporary biological and medical research (Leonelli, 2016; Stevens, 2013; Vermeulen, 2016).Footnote 10 With regard to genomics, Stephen Hilgartner has argued that it propelled a new “knowledge-control regime” that was distinct from existing disciplines, such as molecular biology. This regime constituted new categories of “agents, spaces, objects and relationships”, and allocated to them “entitlements and burdens” that led to novel ways of conceiving and disseminating knowledge (Hilgartner, 2017, p. 9).

Hilgartner’s empirical work has focused on the US-HGP as an exemplar of new players—the genome centres—and new rules for processing, storing and sharing the data they produced. Crucially, the emergence of the knowledge-control regime of genomics was neither immediate nor uniform. It occurred gradually throughout the 1990s, with more intensity in some parts of the world than in others. The rest of this chapter emphasises the gradualism of the transformation within the US-HGP, and how other human genome programmes adopted different knowledge-control regimes. Some of these alternatives to the US-HGP, we argue, never converged with what Watson and his DoE colleagues advanced.

A challenge that the NHGRI faced was transforming the funding culture of the NIH into a system that would enable large-scale mapping and sequencing. Like many other biomedical funders, NIH managers and administrators were used to issuing competitive calls for proposals and awarding grants across relatively large numbers of laboratories, following peer review of their applications. This differed from the DoE model, which rather than running a responsive grant mechanism would distribute its budget among a narrower cohort of recipients: its network of national laboratories. The DoE funding system had allowed the creation of a number of genome centres that prioritised the production of map and sequence data via the development of high-throughput technologies and the deployment of industrial modes of production. These genome centres were based in some of the DoE laboratories and had begun operating during the preceding Human Genome Initiative. Although Watson could not exempt the NHGRI from the NIH grant-award system, he established different, specific criteria when distributing US-HGP funds with the aim of fostering a similar type of operation to the DoE one.

The main criterion for NHGRI grants was whether the applicants and their home institutions could contribute to the establishment of a solid base of whole-genome mapping and sequencing centres. With this, Watson sought to avoid what he labelled the “cottage industry” approach, which he attributed to the sequencing of microorganisms (Watson, 1990, p. 45). This approach consisted in the formation of large inclusive consortia and required the distribution of resources as widely as possible among the communities working on the organisms to be characterised. Watson’s attribution of “cottage industry” was initially aimed at the sequencing of the bacterium Escherichia coli, but as the 1990s progressed, the EC’s Yeast Genome Sequencing Project emerged as the most widely cited example of cottage industry genomics (e.g. Palca & Roberts, 1992, p. 957).

For Watson, the cottage industry approach presented several logistical problems when applied to larger genomes. Instead, the NHGRI sought to gradually form a small set of funding recipients with industrial mapping and sequencing capacities that were not necessarily interested in conducting research using the resulting data. This change of ethos, however, did not become fully implemented in the USA until the mid-to-late 1990s, partly due to the resistance it encountered among some quarters of the genetics community.Footnote 11

During the early days of the US-HGP, the NHGRI administrator in charge of distributing genome mapping grants was Jane Peterson. She worked hard to persuade laboratories equipped with the appropriate technologies and expertise to broaden the genome areas they would tackle. Some of these laboratories featured long-established teams of medical geneticists that had historically focused on smaller regions of human chromosomes encompassing genes or genetic markers connected to diseases.Footnote 12 Examples of this were Victor McKusick and Frank Ruddle’s groups, at Johns Hopkins University and Yale University respectively. These two scientists (Fig. 3.1) had pioneered the chromosome mapping workshops, forums at which geneticists from all over the world shared their mapping results.

Fig. 3.1
Victor McKusick (left photograph, second seated from left) with fellow medical geneticist P. S. Gerald; and Frank Ruddle (right photograph, standing wearing a white shirt) surrounded by, among others, G. J. Darlington and R. S. Kucherlapati. They were all attending the first chromosome workshop, held at Yale University in 1973. Both pictures from: New Haven Conference (1974, pp. 209 and 211); copyright © 1974 Karger Publishers, Basel, Switzerland

Started in 1973 and continued annually or biennially until the release of the human reference genome, these workshops produced human genome maps with increasing numbers of genes and markers on them, and at improved resolution (Fig. 3.2).Footnote 13 They achieved this through the collation of multiple partial results: those reported by individual genetics groups working on a specific disease or set of diseases at given chromosomal locations. By collectively gathering and pooling these results, the workshops gradually covered broader areas of the chromosomes and populated them with an increased number of landmarks (Jones & Tansey, 2015). In 1987, building on the success and consolidation of this model, McKusick and Ruddle co-founded Genomics, a journal devoted to the publication of mapping results (Kuska, 1998; Powell et al., 2007, pp. 13ff). Yet in order to achieve the US-HGP goals, the NHGRI needed to fund institutions—rather than collectives—whose mapping went well beyond the contributions to the chromosome workshops or the results published in the articles of Genomics.

Fig. 3.2
Part of the genetic linkage map and physical map of human chromosome 18, as reported in the Fourth International Workshop devoted to its mapping, held in Boston (USA) in 1996. The genetic linkage map is displayed on the left of the picture and labelled as “Genetic map”, with the physical “RH map” arrayed next to it (Radiation Hybrid—RH—maps are a form of physical map). From: Silverman et al. (1996), p. 119; copyright © 1996 Karger Publishers, Basel, Switzerland

On the sequencing front, the NHGRI initially funded a small number of individual grants aimed at model organisms with relatively small genomes, such as E. coli, C. elegans, and D. melanogaster, as well as a number of yeast (S. cerevisiae) chromosomes (Chap. 2). Some, but not all, of these grants were among the first set of genome centre grants funded in 1990. Strategically, not only were those grants intended to contribute towards the completion of the sequences of their target organisms but, more importantly in the long term, to act as platforms for technology development and the creation of the infrastructures for the establishment of sequencing centres. In 1996, the NHGRI awarded a set of six grants as pilots for human genome sequencing; these projects had a minimum target of sequencing one Megabase (one million nucleotides, or bases) of human DNA. With the information and experience gained in these pilots, the NHGRI scaled up its sequencing programme in 1999 with the funding of three Genome Sequencing Centers at Washington University, Baylor College of Medicine and the Whitehead Institute (of the Massachusetts Institute of Technology and Harvard University). At all three of these sites, the sequencing centres were outgrowths of previously funded genome centres and pilot sequencing projects.Footnote 14

A defining characteristic of those sequencing centres was that their funding and organisation prioritised the completion of their target genomes over any other scientific or medical objective, including the mapping of genes or markers associated with diseases. This form of operation was difficult to deploy beyond the USA. For example, in most European countries, governments had neither the resources nor the motivation to create specific grants for large-scale genome mapping and sequencing at dedicated centres. Private and charitable funds, by contrast, had fewer constraints and could be more easily channelled to a particular enterprise or group, as opposed to having to support a wider scientific community. This was the case for the Wellcome Trust, a British charity that teamed up with the MRC in 1992 to create the Sanger Institute, an institution that substantially contributed to the completion of the yeast, C. elegans, human and pig genomes (Chaps. 4 and 5). Another example of a charitably-funded genome centre was Généthon, supported by AFM-Téléthon, the French Muscular Dystrophy Association. Established in 1990, this institution was devoted to comprehensive mapping and quickly became a world leader in the production of genetic and physical maps encompassing the entire human genome.Footnote 15 Généthon combined whole-genome work at its own initiative with a service role, attending to mapping requests from the French medical genetics community. This service role differentiated it from the US sequencing centres (Jordan, 1993, pp. 131ff; Kaufmann, 2004).

In spite of their influence, Généthon and the Sanger Institute were exceptional cases outside the USA. The US-HGP was unusual in its commitment to full human genome mapping and sequencing when compared to programmes introduced by other governments, especially in Europe.Footnote 16 Those other programmes did not distinguish the human genome work they sponsored as sharply from medical genetics research as the US-HGP did. For this reason, they refrained from focusing on producing a reference map and sequence of the full human genome and were closer to the distributed, networked organisation that the EC was implementing in its sequencing projects. This distributed form of organisation was more suitable for fostering communication and tailoring the genome work to the regions of interest of the local medical genetics communities. The British HGMP was one of the earliest examples of this way of approaching the human genome.

2 The UK Human Genome Mapping Project

In 1989, one year before the launch of the US-HGP, the British Government authorised the release of 11 million pounds to fund the HGMP, a three-year programme that would be managed by the MRC.Footnote 17 The key proponents of this initiative were Brenner, a senior scientist who had just left the Laboratory of Molecular Biology of Cambridge (LMB) after a successful 30-year tenure, and Walter Bodmer, a reputed geneticist who coordinated the research laboratories of the medical charity Imperial Cancer Research Fund (ICRF).Footnote 18 Keith Peters, a practising physician with ample experience in teaching and researching immunology at London’s Hammersmith Hospital, had presented the HGMP proposal on Brenner and Bodmer’s behalf to the Advisory Committee on Science and Technology (ACOST). This body directly reported to the UK Prime Minister—in this case Margaret Thatcher—on projects that were likely to generate impact and required rapid funding. It approved the HGMP on Peters’ recommendation and transferred the funds in less than one year (Balmer, 1996).

The prime mover behind the HGMP was Brenner. He had moved to Cambridge (UK) in 1956 to begin his research career, having recently concluded his PhD. Watson had also moved to Cambridge at the same stage in his career and returned to the USA the same year Brenner arrived in the UK. Brenner became the main collaborator of physicist-turned-biologist Francis Crick, who had successfully worked out the structure of DNA with Watson. Up to the early-1960s, Crick, Brenner and Watson focused on what became known as the coding problem: how the order of the nucleotides comprising DNA affects the synthesis of specific proteins that are responsible for most of the structural and functional aspects of the living cell (de Chadarevian, 2002, Part II; see also Kay, 2000).

In 1962, the same year Watson and Crick were awarded the Nobel Prize, the LMB was founded as an MRC-supported institution that would host an increasingly influential group of biologists in Cambridge. Crick became the director of the LMB Division of Molecular Genetics and Brenner started a long-term line of research, adopting the nematode worm C. elegans as a model to investigate the genetics of development and behaviour. This enterprise sought a detailed description of the worm’s neuron circuitry, as well as its development from embryo to adult, with the hope of finding the “programme” that connected brain activity and cell differentiation to particular C. elegans genes.Footnote 19 The project included crossing experiments in which Brenner attempted to produce mutant worms and identify specific genes associated with variation in properties such as size or mode of movement, as geneticists had done with the fruit fly Drosophila and other organisms. Brenner also recruited more junior associates that would carefully detail the fates of every single cell throughout the C. elegans life cycle—its cell lineages—and the position and synaptic connections of each neuron in its brain.Footnote 20 To this end, John Sulston joined the LMB in 1969 to chart the multiple divisions of cells during the worm’s embryonic and post-embryonic development (de Chadarevian, 1998).

By the time Brenner first proposed to map the human genome, in 1986, the worm project was experiencing a profound transformation. The description of cell lineages and brain connectivity had been completed by the early-1980s and a project to construct a physical map of the worm’s genome had started under the leadership of Sulston and Alan Coulson (Fig. 3.3). Coulson was a research assistant who joined the team after working at another LMB division on the development of early DNA sequencing techniques. Brenner, however, was becoming increasingly sceptical about the possibility of matching the detailed information his team had gathered about cell divisions and synaptic transmissions in C. elegans to the genes Sulston and Coulson would identify in their map, given the complexity of developmental processes in multicellular organisms (Lewin, 1984).

Fig. 3.3
Left, Sydney Brenner with co-discoverer of the double helical structure of DNA, Francis Crick, at the Laboratory of Molecular Biology of Cambridge in 1962. Right, John Sulston holding a section of the physical map of C. elegans around 1985 (pictures of the nematode worm are pinned to the wall behind him). Copyright of left image: Hans Boye/MRC Laboratory of Molecular Biology. Copyright of right image: MRC Laboratory of Molecular Biology. Both reproduced with permission

Partly because of this, in the same year as his human genome map proposal, Brenner left the LMB and established a Molecular Genetics Unit that, despite also being supported by the MRC, was part of the School of Clinical Medicine of the University of Cambridge. In this Unit, Brenner continued some work on the genetics of C. elegans but left the physical map to Sulston and Coulson, who remained at the LMB. The other lines of research in Brenner’s Unit were the development of genome mapping technologies and “certain aspects of gene evolution”.Footnote 21

Brenner’s proposal was entitled “A physical map of the human genome” and it was submitted in November 1986 to the Cell Board, the body of the MRC that funded genetics research. In his case for support, he argued that it was by then “not clear” whether the resources needed for a “central facility” to sequence the entire human genome “would ever be made available”. This led Brenner to advocate for the construction of a physical map not only as a “first necessary step towards the grander sequencing proposal, but also for the more immediate benefits” it could bring “to medical research and practice”. Brenner’s vision started with a laboratory that would “carry out” the mapping programme and “act as the reference centre for human genetics”. A “central concept” of his strategy was to establish “cooperative links and not enter into competition with individual research projects”. In this regard, Sulston and Coulson’s ongoing physical map of C. elegans provided a “useful benchmark” for Brenner’s intended human mapping enterprise.Footnote 22

At the time of this proposal, Brenner was serving on the committee of the National Academy of Sciences that advised the US Government on the plausibility and best strategy for conducting a human genome project. By late 1986, the discussions were still nascent and the model of tackling the entire human genome at dedicated and comprehensive mapping and sequencing centres had not yet attained majority support. Nevertheless, this comprehensive and concentrated strategy was gaining momentum in the USA. The physical mapping exercise that Brenner envisaged for the UK and the reference laboratory that would execute it differed in many respects from what became the US-HGP.

First, and in contrast to Watson, who also served on the committee, Brenner did not support a whole-genome operation. For Brenner, the size of the human genome, 30 times bigger than that of C. elegans, meant that a comprehensive mapping and sequencing initiative would yield a substantial volume of data that would not correspond to genes. Biomedical scientists were well aware that only a small fraction of human DNA constituted genic regions, i.e., those directly involved in the synthesis of proteins. By the mid-to-late 1980s and early-1990s, a large proportion of those scientists, especially within the human and medical genetics communities, regarded the remainder of the genome as ‘junk DNA’: repetitive sequences that were expected to be non-functional.Footnote 23 Based on this common wisdom, Brenner argued that mapping and sequencing the entire human genome was not a worthwhile enterprise (Brenner, 1990). He, however, maintained his commitment to detailed descriptions of organisms that, due to their simpler developmental processes, could be used to model the molecular basis of life properties.Footnote 24

Secondly, the reference laboratory that would channel Brenner’s genome project was conceived to operate at the behest of human and medical geneticists. This was largely due to the framing of his proposal against the background of the ongoing physical mapping of C. elegans. Since 1983, Sulston and Coulson had mapped ever-increasing areas of the worm’s genome by fulfilling requests of laboratories working on specific C. elegans genes. This had been mutually beneficial and ensured that the mappers were regarded as important, foundational members of the C. elegans research community: Sulston and Coulson crucially contributed to the objectives of this community, while increasing the resolution of their physical map (García-Sancho, 2012). The genome centres that Watson established for the US-HGP lacked this community service role: they mapped and sequenced comprehensively, at their own initiative rather than addressing requests from other laboratories. Although the genomes of C. elegans and S. cerevisiae were part of the remit of these large-scale centres, the US-HGP approached the mapping and sequencing of both organisms as a means of easing the path to human genome work rather than engaging with the research necessities of worm and yeast biologists.Footnote 25

Thirdly, and as a consequence of the above, Brenner’s project sought to involve the existing human and medical genetics groups rather than creating a new community of genome centres and specialist genomicists. After receiving Brenner’s proposal, the MRC sounded out the opinion of reputed scientists and institutions in search of arguments for approval or rejection, as well as possible sources of co-funding. One of Brenner’s first allies was Bodmer, who belonged to a group of geneticists who in the 1960s and 1970s had pioneered the mapping of a region of the human genome called the Human Leukocyte Antigen system (HLA).Footnote 26 This region contains densely-packed and hypervariable genes implicated in the immune response to infection; the variability of many of these genes aided their mapping (Löwy, 1987; see also Heeney, 2021). In his role as director of research at ICRF, which he took up in 1979, Bodmer equipped the charity’s laboratories with cutting-edge DNA mapping and sequencing technologies (Weston, 2014, esp. Chs. 2-4). Another supporter of Brenner’s proposal was Peters, who in 1987 moved from Hammersmith Hospital to the University of Cambridge due to his appointment as Regius Professor of Physic and Dean of the School of Clinical Medicine. From this position, he oversaw the establishment of Brenner’s Unit in the school and saw human genome mapping as an opportunity to connect genetics research with medical goals.Footnote 27

Peters had also become life sciences adviser on ACOST and suggested this committee, which reported directly to the Prime Minister’s Office, as a potential source through which the MRC could obtain the necessary funding for the human mapping project. In 1988, he formally endorsed Brenner’s proposal and presented it to an audience that included Thatcher and her chief scientific adviser. He emphasised his experience as a practising physician and argued that the resulting physical map would become “the central tool for basic and applied research in the medical sciences”.Footnote 28 ACOST agreed to support the initiative, which was subsequently named the HGMP. This support materialised in an extra 11 million pounds that the Treasury transferred to the MRC as an earmarked fund to be spent exclusively on a Directed Programme of grants and a Resource Centre for human genome mapping. The funding was for a three-year period (April 1989 to April 1992) subject to extension following a progress review.

From its inception, the HGMP sought to build an identity that distinguished it from other human genome projects, especially the one that was already set to start in the USA. The US National Academy of Sciences had issued its report a few months before Peters’ presentation to ACOST and, by 1989, the NIH and DoE’s agreement to join forces in the US-HGP was being ironed out. Given the extraordinary budget and timeframe of the US effort—three billion dollars over 15 years—an early concern for the HGMP was how to make a differentiated contribution with a fraction of the money and a much more limited time horizon.

Tony Vickers, the HGMP manager, argued in his first report to the MRC in 1991 that in the UK there was “no individual enthusiasm” for becoming involved “in mega-sequencing”, a task that was “unlikely to yield rewards to compensate workers for the drudgery involved”. The British biomedical community, however, had “substantial strengths” in “many fields of genetics” where human genome mapping offered “promise of immediate and substantial pay-off”. This short-term pay-off had somehow been “left aside” by the US-HGP with its focus on comprehensive, large-scale work at genome centres that were distant from the communities that would use the map and sequence data. The HGMP sought to take advantage of the prompt exploitation of results by involving the human and medical genetics communities in the mapping exercise.Footnote 29

Consequently, the research grants awarded by the Directed Programme supported groups that were either developing mapping and sequencing technologies, creating shared resources to aid in these operations, or focusing on chromosomal regions connected to various types of genetic conditions, among them disorders affecting blood (haemophilia), mental health (aneuploidy syndromes) and muscular mobility (myotonic dystrophy). Of the five institutions in receipt of the largest amount of funding (Fig. 3.4), four investigated different aspects of medical genetics: ICRF, the Human Genetics Unit of the University of Edinburgh, the Institute of Molecular Medicine at John Radcliffe Hospital in Oxford and Guy’s Hospital in London.Footnote 30

Fig. 3.4
The level of grant support per institution that the UK Human Genome Mapping Project had awarded by 1992, in thousands of pounds. Of the overall £2,011,000 that the LMB (Laboratory of Molecular Biology) received, just over half (£1,150,000) went to co-fund the start of Sulston and Coulson’s C. elegans sequencing project. The C. elegans grant was an outlier in the funding policies of the HGMP and, as such, is further examined in Chap. 4. Source: “Table 1: Distribution of HGMP awards (numbers and volume) amongst centres” in T. Vickers (1992) “MRC Review of the UK Human Genome Mapping Project: Project Manager’s Report”, p. 13. Report courtesy of Tony Vickers; Table 1 reproduced by kind permission of the Medical Research Council, as part of UK Research and Innovation

The outcomes of the Directed Programme grants were delivered to the Resource Centre. This institution was housed in the Clinical Research Centre, a unit that the MRC had established in 1970 at Northwick Park Hospital (in northwest London) to foster collaboration between biomedical research and clinical practice. The Resource Centre was organised into two divisions that were headed by a biological manager (Ross Sibson) and a computing manager (Martin Bishop). Their duties involved assisting HGMP awardees in various capacities, from conducting mapping and sequencing work on request, to providing ad hoc support through their advanced technology and expertise (Balmer, 1998; Glasner, 1996). To do this, Sibson and Bishop’s teams liaised with the so-called “user community”, addressed their feedback and ensured access to the shared resources. They also collated the map and sequence data coming from the grant-supported laboratories.Footnote 31

By 1991, a probe bank and a library of Yeast Artificial Chromosomes (YACs) were being transferred from their originators, all of them Directed Programme awardees, to the Resource Centre. The probe bank had been compiled by Nigel Spurr, a researcher at Clare Hall Laboratories, one of the ICRF divisions that Bodmer had equipped with the latest mapping and sequencing instruments during the 1980s (Weston, 2014, Ch. 4). It consisted of a series of DNA fragments whose known sequence enabled screening and the detection of specific chromosomal locations. The YAC library was a collection of human DNA fragments inserted in yeast cells and kept under controlled conditions in cultures. It was used as a source for chromosome mapping and derived from a collaboration between David Bentley at Guy’s Hospital in London and Kay Davies at John Radcliffe Hospital’s Institute of Molecular Medicine. Both scientists were renowned for applying genetics research to medical problems, presented by the patients of their home hospitals, and were regular recipients of HGMP funding.Footnote 32

On top of housing and managing these shared tools, the Resource Centre started an in-house sequencing programme using complementary DNA (cDNA) methods. These methods allowed researchers to sequence only the DNA that is transcribed to produce messenger RNA, a vital step in protein synthesis. They therefore enabled the capturing of protein-coding genes in the DNA. The HGMP Directed Programme Committee decided, in 1990, that Sibson’s division would apply this technique to “tissue”-specific and “developmental stage”-specific DNA, as well as the mapped fragments that the Resource Centre compiled from grant-awarded laboratories. This approach would produce “cDNA markers” that, combined with the ongoing physical map, would become “a valuable tool for researchers in human genetic disease”. The cDNA component was adopted as a “strategy” aimed at yielding sequence information “in a relatively short time span”, thus being “more practicable than mega sequencing of the human genome”. It was regarded as a “flagship for the UK” and “essential” for achieving “international credibility” and taking “the lead” among the competing genome efforts.Footnote 33

This mode of operation meant that the HGMP pursued a similar strategic approach to the EC’s genome programmes. As the EC was doing for yeast and H. sapiens (Chap. 2), the MRC sought to involve existing genetics research laboratories in its genome project and distribute the HGMP grants among them as inclusively as possible.Footnote 34 This differed from the more selective funding regime of the US-HGP and the wider distance between the large-scale genome centres and genetics research institutions. More fundamentally, the two genome projects differed in their overall goals: whereas the US-HGP aimed for a reference sequence of the whole human genome—something that its much larger budget and timespan allowed—the HGMP restricted its remit to the genome regions on which its user communities were working. These human and medical geneticist users would develop catalogues of variation from the resulting mapped and sequenced regions.

3 Reference Sequence vs Catalogues of Variation

Historically, the production of a reference sequence of the whole human genome was not an objective of the human and medical genetics communities. These communities had indeed engaged in the mapping of the human genome and had done so at an increasing scale since the start of the chromosome workshops, in 1973. However, they had always limited the scope of their efforts to the regions of interest to the genome mappers: geneticists studying specific diseases or biological traits who pooled their results on the chromosomal locations of genes or genetic markers with other community members. The HGMP and the EC’s Human Genome Analysis Programme (HGAP) had built on this collective endeavour and sought to foster it with ring-fenced funding, international networking and resource centres that provided technical assistance and shared mapping technologies, as well as cDNA sequence data. Yet, as the support of these programmes was tailored to human and medical geneticists, the mapping and sequencing results were constrained to the genes and markers they were pursuing, rather than covering the entire human genome.

Human and medical geneticists would deem these genes and markers to be mapped at sufficient resolution when they could be assigned to a precise DNA fragment. Once this happened, the fragment would often be sequenced and compared with equivalent genome regions. These comparisons were made between humans and closely-related non-human species, or between healthy individuals and patients suffering from the condition with which the gene or marker was associated. The mapping and sequencing processes combined collaboration (at chromosome workshops and more specific groupings, often deploying cDNA techniques) with competition for being the first to determine the chromosome locus or sequence of a gene or marker. A source of inter-species comparison was the growing number of databases with map and sequence information from simpler organisms, such as S. cerevisiae or C. elegans, that were being compiled either through their own specific programmes or as a result of funding from human genome efforts. In this regard, both the HGMP and HGAP supported the consolidation of repositories of data on the mouse, an organism evolutionarily much closer to H. sapiens than yeast or a worm, and from which both medical and developmental inferences could be made.Footnote 35 To access data from patients, medical geneticists created consortia, some of them also sponsored by the HGAP (Table 3.1), that enabled them to uncover genes involved in diseases and compile catalogues of genetic variants associated with the conditions.Footnote 36

Table 3.1 An example of a consortium of institutions pursuing medical genetics goals: the European Gene Mapping Project (EUROGEM), supported by the European Commission’s Human Genome Analysis Programme (HGAP). The consortium included institutions involved in genome mapping activities and resource centres. None of these institutions participated in the determination of the human reference sequence nor in the whole-genome physical mapping that aided the sequencing (compare with Chap. 4, Table 4.1). Elaborated by Miguel García-Sancho and Jarmo de Vries, from data collected by Hallen and Klepsch (1995, esp. p. 20)

These catalogues of variation were often curated at hospitals with strong genetics departments. They formed repositories to which the rest of the community could contribute data, and from which they could access it. The HGMP Resource Centre and other similar central facilities that the HGAP developed shared this philosophy through the community-built and collectively-accessible probe banks, YAC libraries and map and sequence databases they offered to their users.Footnote 37 These shared resources were themselves the product of collaborative projects that the resource centres and genetics research laboratories jointly undertook with funding from the HGMP or HGAP (Table 3.1).

In the mid-1960s, before the arrival of DNA sequencing techniques, McKusick had pioneered these types of collections in Mendelian Inheritance in Man, a catalogue of annotated chromosome maps that was first published as a series of printed volumes and later as an electronic database (Online Mendelian Inheritance in Man). Both the volume series and database incorporated updates with new data stemming from the chromosome mapping workshops and other disease-specific consortia, as well as clinical information about the underlying genetic conditions (Lindee, 2005, Ch. 3; Hogan, 2016, Ch. 3).

With the growth and development of physical mapping and sequencing techniques across the genetics community from the late-1980s onwards, both the workshops and variation catalogues became more specific: the former devoted to single chromosomes and the latter to individual diseases. An early example of this followed from the 1989 mapping of the gene for cystic fibrosis, the first condition to be assigned to a physical location, in this case on human chromosome 7. One of the mapping scientists, Lap-Chee Tsui, was subsequently appointed as co-convenor of the chromosome 7 mapping workshops. Tsui also established the Cystic Fibrosis Genetic Analysis Consortium and coordinated the compilation of sequence variants connected to different forms of the disease that were determined by researchers all around the world. The results were gathered in a database that is still active at the University of Toronto Hospital for Sick Children, Tsui’s home institution until 2004, and used to diagnose the condition.Footnote 38

During the mid-to-late 1990s, Tsui’s endeavour developed into a map encompassing the whole of chromosome 7. A younger member of the Toronto team, Stephen Scherer, built on the networks around the cystic fibrosis consortium and chromosome workshops to create a growing map with assignments associated with other conditions and loci. Scherer’s collaborators included both medical geneticists and institutions working on the comprehensive mapping and sequencing of chromosome 7, among them the genome centre at Washington University. Yet the objective of Scherer’s map was not to serve as a platform for the sequencing of the entire chromosome. Rather than pursuing a single reference sequence—as Washington University and the other genome centres did—Scherer and his fellow medical geneticists sought a way of detecting, mapping and cataloguing variation. Their map was a means of obtaining a set of ordered DNA fragments, some of which could be compared to data derived from patients. That way, differences in both fragment size and pattern, or underlying DNA sequence, could be connected to particular conditions and assigned to specific chromosomal locations.Footnote 39

The pursuit of variation by medical geneticists contrasted with other communities working on non-human organisms. Compared to the HGAP, the EC used a different strategy for yeast and sought a full reference sequence of its genome (Chap. 2). Apart from the extreme discrepancies in genome size, this divergent strategy was due to the aims and necessities of yeast geneticists, biochemists and cell biologists being distinct from those of the communities working on human DNA. While human and medical geneticists were interested in sequence differences underlying disease or other traits, the consortium of laboratories that undertook the EC’s Yeast Genome Sequencing Project aimed to use this organism to model the functioning of the eukaryotic cell. Each community, therefore, approached its target genome in a different fashion. In the case of the human genome, the focus was on comparing specific regions—those where genes were located—across either different species or hospital patients versus controls. In the case of yeast, the laboratories in charge of the sequencing project used this organism as a “wild type” (Holmes, 2017) and pursued a standardised description of its genome, in order to relate the sequence data to functional aspects of cell genetics and metabolism. For this reason, they targeted a specific strain—S288C of S. cerevisiae—as representative of the yeast species as a whole and did not address variants until the full reference sequence was completed (Szymanski et al., 2019).

Similarly, within the history of molecular biology, substantial efforts had been devoted to achieving comprehensive descriptions of “exemplary” model organisms: viruses and bacteria first and further unicellular and multicellular organisms from the 1970s onwards (quote from Strasser & de Chadarevian, 2011; see also: Creager, 2002; Kay, 1993; Ankeny & Leonelli, 2020). The hope was that, as with the S288C strain of S. cerevisiae, those organisms would enable researchers to connect genes to different biological mechanisms and processes, and their effects. This, therefore, paralleled the goal of Brenner’s C. elegans project, and Sulston’s mapping and sequencing of the full genome of the worm. Like the yeast communities, molecular biologists would use the exemplary descriptions and descriptive models (Ankeny, 2000) as the basis of comparative practices. Unlike S. cerevisiae, however, the reference sequence of C. elegans could not be traced to a specific population.Footnote 40

Brenner considered the human genome to be too large and complex for an equivalent description to that being pursued for C. elegans, and so aligned with the human and medical genetics communities through the proposal of the HGMP. Yet, on the other side of the Atlantic, Watson found in the US-HGP the timeframe and resources needed to export the exemplary descriptive approach to the human genome. His genome centre model sought to fully describe the human genome as a standard or wild type, by producing a reference sequence rather than selectively tackling and comparing regions, as human and medical geneticists had traditionally done. This is what has led Hilgartner to identify Watson with a “vanguard” that consolidated genomics as an independent field, one that could be distinguished from other life sciences disciplines (Hilgartner, 2017).Footnote 41 In this differentiation, however, the large-scale centres that produced the reference sequence became both separated and distant from the genetics laboratories that would use the data and that were often involved in other forms of conducting genomics, more aligned with the approaches of the HGMP and the EC’s programmes.Footnote 42

The US-HGP dominates the historiography of genomics. As we have argued, however, its model of organisation was the exception rather than the rule during the formative years of genomics research. In the previous chapter, we conveyed the heterogeneous array of institutions, genomicists and organisational models involved in yeast genome sequencing. In this chapter, we have documented the diversity that also characterised human genomics. Taken together, both chapters show that the model of the US-HGP, with its large-scale centres and comprehensive sequencing regime, falls short in representing not only the history of genomics but also that of the more specific subfield of human genomics (Fig. 3.5).

Fig. 3.5
An outline representation of the US Human Genome Project, UK Human Genome Mapping Project and European Commission’s Human Genome Analysis Programme. Only aspects that have been discussed in the chapter are included and there are some notable absences, such as the programmes on ethical, legal and social aspects of genomics research that the three initiatives supported. Elaborated by both authors.

In the next chapter, we identify the factors that led to a growing concentration of institutions and productive capacity during the determination of the human reference sequence. The transition of the C. elegans project from mapping to sequencing—along with the rise of the Wellcome Trust as an influential, proactive funder—spread the genome centre model beyond the USA and made it dominant in human reference genomics towards the mid-to-late 1990s. This process, we argue, not only affected scientific practice and organisation: it also occluded other historical trajectories in favour of the canonical winners’ story based on the US-HGP.