Conservation Genetics, Volume 19, Issue 6, pp 1281–1293

Biases induced by using geography and environment to guide ex situ conservation

  • Patrick A. Reeves
  • Christopher M. Richards
Research Article


Abstract

Ex situ germplasm collections seek to conserve maximum genetic diversity in a small number of samples. Geographic and environmental information have long been treated as surrogate measures of genetic diversity, proposed to be useful for increasing allelic diversity of collections. We examine the effect of maximizing geographic and environmental diversity on the retention of distinct haplotype blocks in germplasm subsets, using three species with extensive genomewide genotypic data. We show that maximizing diversity in the surrogate measures produces subsets with uneven representation of haplotypic diversity across the genome. Some regions are well conserved, exhibiting high haplotypic diversity, while others are poorly conserved and contain significantly less haplotypic diversity than would be obtained via random sampling. In two of three species, poorly conserved genomic regions were enriched in regulatory genes which, as a class, contribute to phenotypic variation. The specific genes affected varied by species but, overall, haplotypic diversity was poorly conserved at genes controlling ~10% of major molecular functions and biological processes. While this study was limited to three exemplar species, we find little evidence to support continued use of geographic or environmental surrogates for ex situ conservation activities attempting to capture maximum genomewide allelic diversity. Although geographic and environmental diversity have proven to be reliable predictors of allele frequency differences and ecotypic differentiation across species ranges, they appear to be poor predictors of allelic diversity per se, offering little opportunity to enrich collections for haplotypic diversity overall, and ample opportunity to bias the conservation of important functional genetic variation. We propose a bioinformatic bridge between haplotypic diversity and the potential phenotypic diversity residing in collections using the Gene Ontology.
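The comparison described in the abstract, a surrogate-guided subset versus random sampling, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy samples, the category labels, the function names, and the simple greedy selection rule are hypothetical and are not the authors' data or algorithm. For such a small example, the expectation under random sampling is computed exactly by enumerating all subsets rather than by resampling.

```python
from itertools import combinations

# Toy data, invented for illustration only (not from the study).
# Each sample has one surrogate category and one haplotype per block.
surrogate = {"a": "north", "b": "north", "c": "south", "d": "south"}
haplotypes = {
    "a": ("h1", "x1"),
    "b": ("h2", "x1"),
    "c": ("h1", "x2"),
    "d": ("h2", "x2"),
}


def distinct_haplotypes(subset, haplotypes):
    """Count the distinct haplotypes retained at each block in a subset."""
    n_blocks = len(next(iter(haplotypes.values())))
    return [len({haplotypes[s][b] for s in subset}) for b in range(n_blocks)]


def maximize_surrogate(samples, surrogate, k):
    """Hypothetical greedy rule: unseen categories first, then fill up to k."""
    chosen, seen = [], set()
    for s in samples:
        if len(chosen) < k and surrogate[s] not in seen:
            chosen.append(s)
            seen.add(surrogate[s])
    for s in samples:
        if len(chosen) < k and s not in chosen:
            chosen.append(s)
    return chosen


def random_expectation(samples, haplotypes, k):
    """Mean per-block count over all size-k subsets (feasible for toy sizes only)."""
    subsets = list(combinations(samples, k))
    counts = [distinct_haplotypes(sub, haplotypes) for sub in subsets]
    return [sum(col) / len(subsets) for col in zip(*counts)]


samples = sorted(surrogate)
chosen = maximize_surrogate(samples, surrogate, k=2)
observed = distinct_haplotypes(chosen, haplotypes)
expected = random_expectation(samples, haplotypes, k=2)
for block, (obs, exp) in enumerate(zip(observed, expected)):
    print(f"block {block}: surrogate-guided={obs}, random mean={exp:.2f}")
```

In this toy case the guided subset retains both haplotypes at one block but only one at the other, which falls below the random-sampling mean. This is a miniature version of the uneven representation the abstract reports, not a reproduction of the study's results.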


Keywords: Gene ontology; Genetic diversity; Genomewide SNP; Haplotype block; Phenotype



Acknowledgements

We thank Kelly Robbins for helpful comments on the manuscript. This research used resources provided by the SCINet project of the USDA Agricultural Research Service, ARS Project Number 0500-00093-001-00-D.


Funding

This study was supported through funds provided to the National Laboratory for Genetic Resources Preservation, Plant Preservation Research Unit by USDA-ARS National Program 301.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

10592_2018_1098_MOESM1_ESM.pdf (113 kb)
Supplementary Figure 1. Schematic representation of the quantization scheme used to convert continuous geographical and environmental variables into categorical data upon which diversity maximization can be conducted. For simplicity, a hypothetical sample of 5 populations from coastal Italy is shown, with geographic data (latitude and longitude) as the target for quantization.

Supplementary Figure 2. Biased representation of biological processes and molecular functions in well-conserved genomic regions. The length of bars represents the ratio of the observed frequency of a term to its expected frequency using the plant GOslim ontology. The X axis is scaled as log base 2 to display the fold difference between observed and expected values fairly. Values above one indicate enrichment of GO terms in regions of the genome where haplotypic variation was elevated. The 10 most biased GO terms are shown. For each species, the top chart shows GO term representation bias in geographic subsets; bottom, environmental subsets. (PDF 113 KB)
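The two operations these captions describe, quantizing continuous coordinates into categories and expressing an observed-to-expected frequency ratio on a log base 2 scale, can be sketched as follows. This is a minimal illustration under stated assumptions: equal-width binning is assumed because the captions do not specify the quantization rule, and the coordinates, bin count, and function names are hypothetical.

```python
import math


def quantize(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def log2_ratio(observed_freq, expected_freq):
    """Log base 2 of observed/expected; 0 means no bias, +1 a twofold excess."""
    return math.log2(observed_freq / expected_freq)


# Made-up coordinates for five populations (illustration only).
latitudes = [40.0, 41.0, 42.0, 43.0, 44.0]
longitudes = [12.0, 12.5, 13.0, 14.0, 16.0]

# Each population's category is its (latitude bin, longitude bin) grid cell.
cells = list(zip(quantize(latitudes, 2), quantize(longitudes, 2)))
print(cells)
print(log2_ratio(0.25, 0.125), log2_ratio(0.125, 0.25))
```

On the log scale a twofold enrichment and a twofold depletion sit at equal distances from zero, which is why a ratio above one corresponds to a log base 2 value above zero.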



Copyright information

© This is a U.S. government work and its text is not subject to copyright protection in the United States; however, its text may be subject to foreign copyright protection 2018

Authors and Affiliations

  1. United States Department of Agriculture, Agricultural Research Service, National Laboratory for Genetic Resources Preservation, Fort Collins, USA
