1 Introduction and background

From the earliest domestication of plants and animals to the present, agriculture has depended on the movement, management, and manipulation of genetic resources. However, the conservation of genetic diversity offers a potentially significant example of market failure. Farmers choose crop varieties and animal breeds in response to their own incentives. Their decisions do not account for the embedded public goods. As a result, the public sector has been closely involved in the conservation and management of genetic resources for agriculture, dating back as far as the establishment of botanic gardens in the 18th century.

In recent years, new challenges have arrived for the conservation of useful genes for agriculture. Increased pressure on global land resources has created a broad threat to biodiversity, and within agricultural systems, production pressures continue to drive out traditional varieties, landraces, and other genetic resources with potential future value. Climate change poses further threats to biodiversity, altering the habitats and ecosystems that support many crop wild relatives. At the same time, shifting technologies have allowed for new ways of collecting and conserving genetic resources; in addition to in situ collections of plants and animals, and beyond collections of seeds in cold stores, there are now a variety of cryopreservation techniques, and the unit of conservation can include semen, blood, eggs (in some cases), embryos and a range of tissues, as well as extracted DNA. In addition, the patterns of use have shifted dramatically for conserved genetic resources; scientists increasingly make use of these materials for research purposes rather than as direct inputs into plant or animal breeding programs. Many genetic resource collections are now used primarily for genomic work and bioinformatics, which then drive breeding programs indirectly.

The new technologies for collection, conservation, and utilization of genetic resources for agriculture imply a need for a refocusing of global strategies for the management of these resources. A range of questions arise. Should conservation efforts focus as intensively as in the past on domesticated species and their immediate wild relatives? Or should conservation be broader, including a greater array of species that cannot be easily utilized at present but that may provide valuable guidance for future research? What is the right way to prioritize collection and conservation? What are the trade-offs between careful cataloging and curation, on the one hand, and rapid collection, on the other? How should managers respond to the growing evidence that many scientific collections ultimately prove useful for purposes other than those that were originally intended?

Many of these questions have a technical basis in genetics and biological science; but they are also fundamentally questions about optimal management and value. With scarce resources to allocate across these different potential activities, economics can contribute useful insights into the trade-offs and decisions that will guide future global and national strategies. This paper will argue, however, that rapid advances in bioscience and technology, along with fundamental unknowns about the future, make it challenging to apply simple theoretical models to key questions about gene bank management.

2 Background

Efforts to conserve and manage genetic resources are neither new nor even, perhaps, uniquely human. Agricultural domestication, fundamentally based on human and natural selection of genetic resources, is arguably just a special case of plant-animal co-evolution (Purugganan and Fuller 2009). In human history, the earliest domestication of plants and animals involved the selection, curation, and purposeful management of genetic resources. And in more modern history, systematic collection and conservation of useful species have taken place across societies, as described by Plucknett and Smith (2014), among others. Antecedents of today’s gene banks include gardens for medicinal plants, botanic gardens, and conservatories. In the eighteenth and nineteenth centuries, imperial powers sought to collect and control genetic resources with commercial uses, hoping to dominate emerging markets for commodities such as cocoa and rubber; see, for example, Frankel et al. (1970) and Frankel et al. (1995).

The rediscovery of Mendelian genetics in 1900 (and even more so the subsequent implementation of Mendel’s thinking by Fisher, Wright, and Haldane) provided a theoretical framework for systematic plant breeding and suggested a direct productive value for genetic resources. This in turn inspired scientific efforts to collect and catalog different varieties of crop species and crop wild relatives. Vavilov’s work on domestication and the distribution of crop varieties supported a view that crop diversity might provide valuable traits, selected in response to local environmental pressures, reinforcing a view that landrace varieties might be valuable for modern agriculture (Nikolaj et al. 1997). This idea, taken with the advances in modern plant breeding that characterized the early twentieth century, appears to have led to a huge demand for seed collections linked to agricultural science programs.

Reflecting the growing interest in seed collections, the term “seed bank” appears to have come into use in the early 1920s (according to Google’s Ngram data). The term “gene bank” first appeared in the mid-1950s – a coinage that conveyed both the breadth of conserved material (i.e., not only seeds) and the notion that the seeds were valuable precisely because they embody useful genes (rather than as plant specimens). The term “crop genetic resources” then entered the literature in 1965. This term combined two key ideas: first, that crop diversity has value for production; and second, that this value resides precisely in the embodied genetic material. In parallel with these linguistic developments, crop seed banks were established in many countries by the beginning of the 20th century, and the collections of the international agricultural research centers (IARCs) took shape almost simultaneously with the founding of these institutions in the 1960s and 1970s.

It is perhaps worth noting that the interest in genetic resources for agriculture went along with broader public and scientific concern with genetic resources. Although the study of plants for human uses had, of course, been around for many centuries, the field of “economic botany” became formalized in the mid-19th century as an interdisciplinary effort to document and measure the utilitarian values of plant genetic resources. [Footnote 1]

And by the late 1980s, the broader concept of “biodiversity” took hold in public conversation, as traced by Farnham (2007). [Footnote 2]

Today, gene banks and seed stores have an established place in the world of agricultural research. The scope of collections extends far beyond seeds of cultivated plant species. The Millennium Seed Bank at Kew contained (as of June 2018) approximately 2.25 billion seeds in storage from 189 countries, representing nearly 40,000 species. USDA collections of animal genetic resources include cryopreserved semen, embryos, ovaries, testis, and blood, representing not only the dominant livestock species but also honey bees, shrimp, oysters, and sea slugs. Additional collections extend to clonally propagated fruit trees and microbial culture collections, as well as collections of algae and fungi.

Gene banks are expensive. Global expenditures on gene bank operations probably run into the hundreds of millions of dollars, although precise estimates are elusive. [Footnote 3] The main categories of costs include collection, cleaning, and cataloging; capital costs for storage facilities; energy costs for refrigeration; staff costs for managing collections; costs for regeneration and multiplication of materials (primarily for seeds); costs of safety duplication (e.g., in the Global Seed Vault in Svalbard); and costs associated with use, such as the distribution and documentation of samples.

In the past, scientists have viewed gene banks as complementary to in situ forms of conservation of genetic resources for agriculture. But the forces of globalization, climate change, and market integration have together undermined the viability of in situ conservation in many settings. Modern agricultural value chains tend to reward genetic uniformity, rather than variability and diversity, and the pressures of intensification and globalization have tended to erode traditional agricultural practices, including the cultivation of landraces. [Footnote 4] Combined with climate change and other pressures that have affected the ecosystems supporting crop wild relatives, these forces pose challenges for reliance on in situ conservation.

If in situ conservation approaches are potentially unreliable, and if ex situ conservation is expensive, economic questions become central to global strategies for conserving genetic resources for agriculture. Two questions seem particularly central: What should we collect and conserve? How (in what form) should we maintain these collections?

The answers to these questions are closely linked to the technologies that are available for working with genetic resources. The term “technology” here refers to the tools and approaches through which genetic resources are used in agriculture – ultimately for improving the performance of agricultural systems, defined broadly. New technologies often reflect (and are frequently derived from) new understandings of “emerging science.” An example of an improved technology might be gene editing, using CRISPR; this is a tool developed with modern genetic science that allows for fast, cheap, and accurate editing of genomes. By contrast, emerging sciences would include more fundamental areas such as computational bioscience or epigenomics or systems biology, where new research expands our understanding and may challenge key assumptions about agricultural systems. Emerging science may eventually lead to technological breakthroughs. Recent decades have seen enormous advances in both science and technology relevant to genetic resources for agriculture. Looking ahead, strategies for collection, conservation, and utilization will surely reflect technological advances and emerging science.

3 Literature review and theory

The economic literature offers many useful theoretical insights that can guide thinking about the economics of genetic resources, especially on the benefits of diversity (e.g., Dasgupta 2000; Brock and Xepapadeas 2003; Kassar and Lasserre 2004; Falco and Chavas 2008; Perrings et al. 2009). The literature also includes numerous thoughtful analyses of the uses of genetic resources for R&D and the divergence between private incentives and social values. Useful works in this vein include Goeschl and Swanson (2002, 2003) and Sarr et al. (2008). Similarly, there are useful discussions of property rights regimes for genetic resources (e.g., Swanson and Goeschl 2000).

However, the theoretical literature offers little guidance on the question of exactly what to conserve or how to conserve it. On the question of what to collect, perhaps the closest is the work of Weitzman (1992), which articulates a set of key principles related to the notion of maximizing the genetic distance covered by a collection of conserved materials. In a sense, his theory reflects a belief that closely related species are redundant. Defining genetic distance is, of course, elusive. Small differences in the genome sequence can correspond to large material differences in phenotype – or at least to differences that matter from the perspective of human beings. Weitzman (1992) suggested that the appropriate measure of genetic distance should be based on heredity, in the sense of cladistics: “[D]efine the distance between any two species as the time ago when they diverged from a common ancestor.” In subsequent work, Weitzman (1993, 1998) sought to show that his theoretical insight could be implemented, using an example from crane conservation as well as a more abstract effort to explore the “Noah’s Ark” problem.
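The distance-maximizing criterion can be made concrete with a small computational sketch. The recursion below follows the diversity function as it is usually stated for Weitzman (1992): the diversity of a set is the largest value, over its members, of the diversity of the set without that member plus the member's distance to its nearest remaining neighbor. The three taxa, their divergence times, and all function names are hypothetical and chosen only for illustration; the brute-force search is practical only for a handful of taxa.

```python
from itertools import combinations


def pair_distances(raw):
    """Store symmetric pairwise distances keyed by unordered pairs."""
    return {frozenset(pair): d for pair, d in raw.items()}


def weitzman_diversity(taxa, dist):
    """Recursive diversity: V(S) = max over i in S of V(S - i) + d(i, S - i),
    where d(i, S - i) is the distance from i to its nearest neighbor."""
    memo = {}

    def v(s):
        if len(s) <= 1:
            return 0.0  # normalization: a single taxon has zero diversity
        if s not in memo:
            best = 0.0
            for i in s:
                rest = s - {i}
                nearest = min(dist[frozenset((i, j))] for j in rest)
                best = max(best, v(rest) + nearest)
            memo[s] = best
        return memo[s]

    return v(frozenset(taxa))


def best_subset(taxa, dist, k):
    """Brute-force choice of the k taxa with the greatest diversity."""
    best = max(combinations(sorted(taxa), k),
               key=lambda subset: weitzman_diversity(subset, dist))
    return best, weitzman_diversity(best, dist)


# Hypothetical divergence times (million years): A and B are close relatives,
# while C split from both of them long ago.
taxa = ["A", "B", "C"]
dist = pair_distances({("A", "B"): 6.0, ("A", "C"): 90.0, ("B", "C"): 90.0})

total_diversity = weitzman_diversity(taxa, dist)
best_pair, best_value = best_subset(taxa, dist, 2)
```

With room for only two taxa, the criterion never keeps the two close relatives together: any pair that includes the distant taxon C scores far higher, regardless of how useful A and B might be.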

Weitzman’s theory is silent about the eventual value of the collection; conserving diversity is an end in itself. This would be a sensible criterion for an expedition to Earth by aliens intent on collecting a sample of all life forms, unconstrained by issues of cost or difficulty. But it is unclear that Weitzman’s approach yields useful guidance for real-world collections. For instance, Weitzman’s approach would tend to privilege keeping two species of army ants that last shared a common ancestor in the mid-Cretaceous, roughly 100 million years ago, over keeping both humans and chimpanzees (common ancestor 5-7 million years ago). Collections of within-species variation in crop plants would be deprioritized relative to more random collection of seeds. This view is logically sound and perhaps defensible, but it seems problematic that the approach entirely ignores utilitarian criteria as well as cost considerations – both of which arguably ought to matter to economists.

Weitzman’s approach assumes that diversity matters intrinsically – or at least that the future uses of the collection are unknown. This contrasts, at the other extreme, with a scenario explored by Simpson et al. (1996), in which an in situ collection of rainforest species is being screened for a single valuable trait (e.g., a cure for cancer). In other words, the value of the collection will come from its ability to solve a single known problem – precisely the opposite of Weitzman’s framing. In the simplified model of Simpson et al., each species is equally likely to contain this valuable trait. Simpson et al. (1996) showed that under many (or most) plausible scenarios, their model implies a very low marginal value of a species in the collection. The logic is compelling. If the probability of finding the desired trait in a given species is high, then a large collection is unnecessary because of redundancy. If the probability is sufficiently low, then the marginal value of an additional accession is also low, because the marginal accession is unlikely to have the useful trait. Their result is only slightly eroded if the target trait is quantitative rather than qualitative; in that case, the search for an incrementally better accession leaves more value on the margin (Gollin et al. 2000).
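The redundancy argument can be illustrated with a stylized sequential-search calculation in the spirit of Simpson et al. (1996). The sketch assumes that species are tested one at a time, that each independently carries the trait with probability p, that the search stops at the first success, and that the expected net payoff from a single test is positive. The payoff, test cost, and collection size are hypothetical round numbers, not estimates from the literature.

```python
def collection_value(n, p, payoff, test_cost):
    """Expected net value of screening up to n species, stopping at the
    first success. Species k+1 is tested only if the first k all failed."""
    return sum((1 - p) ** k * (p * payoff - test_cost) for k in range(n))


def marginal_value(n, p, payoff, test_cost):
    """Expected value added by one more species when n are already held:
    it matters only if all n earlier tests failed."""
    return (1 - p) ** n * (p * payoff - test_cost)


# Hypothetical parameters: 1,000 species already held, a payoff of 1,000,000
# for the trait, and a cost of 0.5 per test.
n, payoff, test_cost = 1000, 1_000_000.0, 0.5

high_p = marginal_value(n, 0.1, payoff, test_cost)    # success almost certain earlier
mid_p = marginal_value(n, 0.001, payoff, test_cost)   # p close to 1/n
low_p = marginal_value(n, 1e-6, payoff, test_cost)    # success almost never occurs
```

Under these assumptions the marginal value is negligible when p is high, below one unit when p is very low, and only a few hundred units (against a payoff of one million) at the intermediate probability where it peaks.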

In the real world of gene banks, however, neither of these approaches seems adequate. Maximizing genetic diversity, by itself, is not a satisfactory criterion (although it is one that we want to keep in mind). In addition to diversity, we might want to consider the relative costs of collecting different materials: some things are easy and inexpensive to collect and conserve; others may be extremely costly. We might also want to assign non-zero weights to criteria such as: benefits to humans; the extent of current threats to survival; costs of maintaining and regenerating material; and so forth. Practical considerations might also matter: since time and travel costs are important in collection, it frequently makes sense to collect exhaustively in particular locations, once a collection team has arrived. If indeed it makes sense to collect exhaustively in particular locations, then perhaps a sensible goal might be to prioritize collection of genetic resources in locations or ecologies that are both unique and potentially vulnerable.

Other issues come into consideration as well. A time capsule of genetic resources, intended for the benefit of future humanity, would look different from a collection that is intended to support research in the present. A one-time collection – a snapshot of genetic resources at a particular moment in human history – would also look different from a collection that is intended to capture the evolving biology of the planet or of any given ecosystem. The point here is that it is difficult to answer the question, “What should we collect?” without defining a specific purpose for the collection.

The challenge, of course, is that the future uses of gene banks are not known (and indeed not knowable). Moreover, we do not know much about the technologies that we will eventually use to extract benefit from collections of genetic resources. And we do not even necessarily know what aspects of the collection will prove most valuable. Will we actually care only about the DNA sequences of accessions in the collection? Or will we instead find that the accessions provide insights beyond their DNA content? Given these uncertainties, no simple theory can provide much guidance; we are faced inevitably with a set of strategic judgments – and with no way of knowing for sure whether we are making the right choices.

4 Changing uses and new understanding

Today’s crop gene banks have emerged in response to two different intended purposes: first, the mobilization, management, and long-term storage of materials that can be readily used in crop variety improvement programs; and second, the long-term conservation of crop genetic diversity, for the potential future use of humanity. These are worthy objectives, but both the scope and structure of the crop gene banks reflect a model from the mid-20th century, rather than a strategic vision for the (approaching) mid-21st century. In terms of scope, the current gene banks focus on varietal diversity within crop species, plus a subset of crop wild relatives. These collections respond to a rational view that when collection and conservation are costly, it makes sense to prioritize material that is directly useful. A concern is that this approach tends to prioritize present knowledge and utility over potential future uses. For instance, the rapidly moving frontier in the biological sciences suggests that we may find increased value in germplasm sources at greater phylogenetic distance from crop species.

From their beginnings, the crop gene banks have been used very little in direct breeding of new varieties. As documented in Gollin et al. (2000), breeders view landrace materials as costly to work with, since they contain many undesirable traits along with potentially desirable ones. Using a landrace in a cross may thus introduce unwanted characteristics, which need to be removed by extensive back-crossing, a time-consuming and expensive process. An implication is that many modern breeding programs primarily use gene banks for genomics and gene discovery, or for other upstream scientific research. It is not uncommon for scientists to examine a broad set of genetic materials to isolate interesting genes or to identify different genetic mechanisms – but then to turn around and look for the same genes in improved varieties.

Increasingly, however, plant scientists are drawing on better fundamental understanding of plant genetic pathways and mechanisms. Convergent evolutionary processes mean that it is sometimes possible to seek (and find) usefully homologous structures in species and even in families that are unrelated to crop and livestock species. We must ask: Where will useful diversity be located for future agriculture?

In the past, the assumption was that the most useful diversity was across varieties – i.e., in large collections of landrace materials and wild relatives. Varieties were assumed in turn to be relatively uniform, genetically speaking. In many cases, individual accessions consisted of relatively few seeds per plant and relatively few plants per variety. For instance, Chang et al. (1972), in an early manual on conservation of rice varieties, suggested that an accession should consist of about 100 g of seed, which corresponded to about 30-40 panicles, ideally one per plant in a field. This reflected a view that a field was likely to be planted with relatively uniform landraces, with a moderate amount of genetic variation across plants and very little variation within the seeds in a given panicle.

Recent genomic studies, however, have challenged some long-standing views of where genetic variation resides. Studies of accessions in seed banks have found non-trivial diversity within accessions, as well as a substantial amount of duplication across accessions. The distribution of genetic variation does not necessarily correspond to variation in geography or phenotypic characteristics; for instance, one recent study of pigmented (red) rices in the Philippines found more genetic variation among varieties within regions than across regions (Mbanjo et al. 2019). Many other examples can be found for plant genetic resources, and similar findings apply to livestock. Prior to widespread use of genetic analysis, a stylized view among livestock scientists was that roughly half of genetic variation was across breeds and half within breeds; molecular analysis now shows consistently that variation within breeds accounts for much more genetic variation than variation across breeds (Harvey Blackburn, pers. comm., August 2018). In other words, breeds may not be genetically very distinct; or distinctions reflect relatively superficial traits. But variation within breeds may be substantial, so that genetic gain can result from intensive breeding within breeds – as has been the case for Holstein cattle. In short, categories such as “breeds,” “varieties,” and “landraces” are not necessarily biologically meaningful concepts. Even species boundaries are not always clear. The old understanding of where diversity is concentrated badly needs to be refreshed on the basis of large-scale genomic studies (among the suggestions made by McCouch et al. 2012).

5 The value of information

The new molecular findings reinforce the message that information about the materials in a gene bank is an important complement to the accessions themselves. Gollin et al. (2000) showed that information can speed up the process of searching a gene bank for useful material, by allowing more precise targeting of the search. But “information” is of course a limitless concept. McCouch et al. (2012) sketch out some useful approaches to systematic genomic analysis of gene bank materials. Clearly, new technologies and high-throughput sequencing techniques will allow a far richer picture of what is contained within the existing crop gene banks. And emerging sciences (e.g., proteomics, transcriptomics, bioinformatics) may reveal additional information about the biological characteristics and capabilities of individual accessions.

But what other information do we need to accompany their DNA sequences? In addition to the basic passport information, there are potentially valuable records from the point of collection. Some databases make it possible to link the accessions to data from studies on characterization and evaluation of germplasm. Arguably, the more information we have about the accessions, the more useful they are.

A non-trivial point is that “useful information” is not confined to the DNA sequences of accessions. In recent years, interest has grown in the possibility of maintaining gene banks and other collections of genetic resources simply as digitized or “dematerialized” collections. For some purposes, the physical seeds are already less useful than the sequenced genomes; e.g., for scientists trying to associate particular genes or gene complexes with phenotypic traits. [Footnote 5] Gene editing techniques already make it possible – at least in principle – to alter the genome of a variety to match a desired profile, without going to the trouble of breeding and backcrossing. As genetic technologies improve, the physical seeds will presumably become less and less important; scientists will more often be able to manipulate directly the target genomes. But it is an open question whether scientists will find it easy to improve on the combination of natural selection and human selection that has led to current genetic combinations in the form of crop varieties; the application of artificial intelligence to this problem is perhaps a promising frontier, but one that remains in its infancy (Washburn et al. 2019).

In the same spirit, although genetic technology is improving rapidly, it may yet be some time before scientists can realistically move useful traits across life forms in productive ways. The limiting factor may not be the technical capabilities of genetic modification or gene editing; instead, it may be the fundamental difficulty and complexity of biology. Many traits are multigenic, and moving gene complexes remains challenging. Even where the genes themselves are simple, the expression of genes is not always straightforward. This suggests that conserving dematerialized DNA will not be a sufficient strategy for some time, although it may be useful.

A broader point is that important values of gene bank collections may ultimately arise from sources other than the DNA of their accessions. Consider the many examples of scientific collections that have yielded valuable insights into issues far removed from the original goals of the collection. For instance, herbarium specimens have shed light on long-term shifts in plant flowering times, related in turn to climate change (Primack et al. 2004). Surgical samples collected by the United States military since the Civil War – and especially those from victims of the Spanish flu epidemic – have provided insight into the management of influenza outbreaks (Morens et al. 2008). Another example is the use of lichen collections to provide evidence on historic changes in air pollution (Purvis et al. 2007). These examples illustrate the point that future uses of collections may take advantage of dimensions of the collections that go well beyond the DNA content of the accessions. Today’s archaeologists will often leave segments of their finds unexcavated, because of the awareness that future generations will approach sites with new tools, questions, and sensibilities. In the same way, we should think of today’s gene banks as repositories of material with uses and values that are not yet apparent.

6 Implications for economic analysis

In a context with so much uncertainty, what role can economics play? There are a number of categories of studies that can help shed light on the problems of genetic resource conservation. Some of the biggest questions do not lend themselves to straightforward analysis, but there are useful analyses that can be undertaken.

6.1 Valuation studies

Although it is frequently tempting to assign values to entire collections, there are huge methodological and conceptual difficulties in conducting such studies. Valuing collections is a difficult (though just feasible) task even in the best case, such as when the materials have well-defined uses in agriculture or pharmaceuticals. For instance, it might be possible to assign a value to a commercial collection of yeasts used in brewing or baking. But for a collection that includes many non-commercial accessions, wild relatives, or other material with little or no current economic use, valuation is arguably an impossible task – at least, without making a lot of untenable assumptions.

One approach – evident in a number of papers in this issue – is to assign a lower bound to a collection by documenting instances when the collection has yielded material of measurable value. These “winning lottery tickets” can sometimes show that the benefits of the collection exceed all past costs – and possibly even exceed any plausible future cost. However, the past successes tell us little or nothing about future uses of the collection. And as noted above, changing biological technologies may alter the future relevance of gene banks as repositories of useful traits. Possibly the value of collections will be increased by our growing ability to identify, transfer, and manipulate desirable genes. But possibly the value of collections will be diminished by our ability to synthesize organic compounds and to create entirely new sequences.

Finally, it is important to recognize that there may be values for a collection of genetic resources that go beyond their use values. A reasonable case may be made that an entirely utilitarian approach – assigning value to a collection based on its future usefulness to humans – is ethically inadequate. In essence, this approach would suggest that species and genera should be conserved only to the extent that they are beneficial to humans. Not only does this ignore the potential value of conserving intact ecosystems, but it also limits our ethical universe in a way that is perhaps indefensible.

6.2 Prioritization

Instead of focusing on valuation, a more promising area of research would focus on establishing priorities for collection and conservation. Given that we cannot immediately collect and conserve everything, where should we start? What is the objective function that we seek to maximize? If we know the goals, then economic tools are useful in thinking about how to achieve the best possible results given a limited budget. For example, we might want to maximize some combination of “novelty” (meaning the pure genetic distance from other materials in the collection) and “use value” (a characteristic reflecting the current economic use value attached to a genus or species). Materials might also differ in the costs of collection and conservation. Given these objectives and constraints, we could formulate a “cost effectiveness” criterion that would allow us to maximize the benefits per dollar spent.
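A cost-effectiveness criterion of this kind can be sketched in a few lines. The example below assumes that each candidate for collection has been scored for novelty and use value on a common 0-1 scale, that the two scores are combined with a chosen weight, and that candidates are then taken in order of benefit per dollar until the budget is exhausted. The candidate names, scores, costs, and weight are all hypothetical; the greedy rule is a simple heuristic, not an exact solution to the underlying budget-allocation problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    novelty: float    # genetic distance from existing holdings, scaled 0-1
    use_value: float  # current economic use value, scaled 0-1
    cost: float       # cost of collection and conservation


def prioritize(candidates, budget, novelty_weight=0.5):
    """Greedy selection by weighted benefit per unit of cost."""
    def benefit(c):
        return novelty_weight * c.novelty + (1 - novelty_weight) * c.use_value

    ranked = sorted(candidates, key=lambda c: benefit(c) / c.cost, reverse=True)
    chosen, spent = [], 0.0
    for c in ranked:
        if spent + c.cost <= budget:  # skip anything that no longer fits
            chosen.append(c.name)
            spent += c.cost
    return chosen, spent


# Hypothetical candidates for a collecting mission.
candidates = [
    Candidate("landrace_A", novelty=0.2, use_value=0.9, cost=2.0),
    Candidate("wild_relative_B", novelty=0.6, use_value=0.5, cost=5.0),
    Candidate("distant_taxon_C", novelty=1.0, use_value=0.1, cost=20.0),
    Candidate("landrace_D", novelty=0.1, use_value=0.8, cost=2.0),
]

chosen, spent = prioritize(candidates, budget=10.0)
```

The ranking is sensitive to the weight placed on novelty relative to use value, which is precisely the judgment that the objective function must make explicit.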

6.3 Conservation cost studies

Because the fixed costs of buildings and cold storage facilities are large, it sometimes appears that the average cost of conservation ex situ is very high. But this is a misleading measure. Once the facilities are in place to store materials, we should be more interested in marginal costs; in other words, the incremental cost of adding one more accession to the gene bank. This cost can be remarkably low, even taking into account the costs of periodic regeneration. For example, Koo et al. (2003) and Kim et al. (2003) examined in detail the operations and cost structures of five gene banks associated with CGIAR crop research centers. They discovered that the costs of conserving crop genetic resources – even including the costs of regeneration – are often only a few dollars per crop variety or item conserved. Given low marginal costs, it is easier to make the case that it is worthwhile to collect an additional species or variety of plant or animal; conversely, if some materials turn out to be hugely expensive to conserve, it may be more difficult to justify the costs of conservation. Comparable figures have not yet been collected for animal genetic resources or non-agricultural genetic resources, but in many cases, costs of collection and conservation are likely to be similarly modest. Such costs are fairly easy to document, and studies of this kind may be able to persuade policy makers that the total cost commitments are well below the likely benefits of collections.
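The gap between average and marginal cost is easy to see in a small numerical sketch. The figures below are hypothetical round numbers chosen for clean arithmetic, not the estimates reported in the studies cited; the function names and the assumption that regeneration costs are spread evenly over a fixed interval are likewise illustrative.

```python
def marginal_cost(storage_per_year, regeneration_cost, regeneration_interval_years):
    """Annual cost of holding one more accession, with the periodic
    regeneration cost spread evenly over the interval between regenerations."""
    return storage_per_year + regeneration_cost / regeneration_interval_years


def average_cost(fixed_cost_per_year, marginal, accessions):
    """Annual cost per accession once fixed facility costs are shared out."""
    return fixed_cost_per_year / accessions + marginal


# Hypothetical gene bank: 1,000,000 per year in fixed facility costs,
# 50,000 accessions, storage of 0.5 per accession per year, and a
# regeneration costing 30 once every 20 years.
mc = marginal_cost(0.5, 30.0, 20)
ac = average_cost(1_000_000.0, mc, 50_000)
```

In this example the average cost per accession is eleven times the marginal cost, so a decision about one additional accession that relied on the average figure would greatly overstate the expense.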

6.4 Winning lottery tickets

Supporters of gene banks like to point to examples of valuable products derived from nature and to argue that these are the examples that demonstrate the value of genetic resources. This approach needs to be managed carefully. One challenge is to avoid conflating the value of the genetic resources with the value of the research, discovery, and commercialization processes. Perhaps more important, there is a difficulty that arises from the focus on “winners.” This approach is comparable to looking at a group of people who have won the lottery and arguing that their success shows that it is a profitable business to buy lottery tickets. Studies of this kind must be careful not to make unfounded claims about future value. It is both accurate and useful to show that demonstrated past benefits might exceed past and future costs. But what would be particularly useful would be also to ask questions about the frequency with which valuable materials have been found; i.e., to develop a statistical model of the likelihood of finding new “winners” in the collection.
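One simple form such a statistical model could take is a beta-binomial calculation, sketched below. It assumes that every screened accession has the same unknown chance of being a "winner," that screenings are independent, and that the analyst counts all accessions screened, including the many that yielded nothing. The counts and the uniform prior are hypothetical, and the function name is illustrative.

```python
from math import exp, lgamma


def prob_at_least_one_winner(winners, screened, new_accessions,
                             prior_a=1.0, prior_b=1.0):
    """Posterior predictive probability that at least one of the next
    `new_accessions` screenings yields a winner, given `winners` successes
    in `screened` past screenings and a Beta(prior_a, prior_b) prior."""
    a = prior_a + winners
    b = prior_b + screened - winners
    # P(no winners in m draws) = B(a, b + m) / B(a, b), computed in logs
    log_p_none = (lgamma(b + new_accessions) + lgamma(a + b)
                  - lgamma(b) - lgamma(a + b + new_accessions))
    return 1.0 - exp(log_p_none)


# Hypothetical record: 3 winners among 10,000 accessions screened so far.
p_next_1000 = prob_at_least_one_winner(3, 10_000, 1_000)
```

The key discipline is in the denominator: the estimate is meaningful only if the record includes the losing tickets as well as the winning ones.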

7 Conclusion

The conservation of genetic resources is an important topic for economics. Public expenditures are already large, and threats to genetic diversity imply that there may be a short window of time in which additional collection must take place. The added threat of climate change, which may induce large losses of diversity, underscores the pressing need to re-evaluate strategies of collection and conservation. This paper has argued that existing economic theory struggles to address the real-world problems of gene bank management. Similarly, the unknown (and unknowable) future uses of genetic resources make it difficult to conduct simple economic analyses. Nevertheless, there are pressing needs for economic research. The most constructive directions for research may not be on assigning overall valuation to collections, but rather on thinking through the prioritization of conservation. Prioritization needs to look beyond the Weitzman criterion of maximizing genetic distance; instead, a broader set of criteria must come into play. And in spite of growing enthusiasm for reducing collections to dematerialized and digitized genomic sequences, we should recognize that the collections of physical specimens – of seed, tissue, semen, and the like – may have value beyond the DNA code. Future researchers are likely to find the collections valuable in ways that we do not fully appreciate today, and that we cannot readily anticipate.