Abstract
This study provides predictive equations for Shannon’s information in a finite population that are intuitive and simple enough to see wide-scale use in molecular ecology and population genetics. A comprehensive profile of genetic diversity contains three complementary components: the number of allelic types, Shannon’s information and heterozygosity. Currently, heterozygosity is better supported than Shannon’s information: it has more predictive models and is integrated into more mainstream genetics software. However, Shannon’s information has several advantages over heterozygosity as a measure of genetic diversity, so it is important to develop it as a tool for molecular ecology. Past efforts at forecasting Shannon’s information in specific molecular ecology scenarios mostly dealt with its expected value at genetic equilibrium, but dynamic forecasts are also vital. In particular, we must be able to predict loss of genetic diversity in finite populations, because such populations risk losing genetic variability, which can adversely affect their survival. We present equations for predicting loss of genetic diversity as measured by Shannon’s information. We also provide statistical justification for these models by assessing their fit to data from simulations and from managed, replicated laboratory populations. The predictive models will enhance the usefulness of Shannon’s information as a measure of genetic diversity; they will also be useful in pest control and conservation.
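As a rough illustration of the three diversity components named in the abstract, the sketch below computes them for one locus from a list of sampled alleles. It assumes the standard textbook definitions: Shannon's information as -Σ p ln p (natural logarithm) and heterozygosity as 1 - Σ p², both as simple plug-in estimates without small-sample bias correction. The function name `diversity_profile` and the toy sample are hypothetical and are not taken from the authors' code.

```python
import math
from collections import Counter


def diversity_profile(alleles):
    """Return (number of allelic types, Shannon's information, heterozygosity).

    Assumes plug-in estimates from observed allele frequencies at one locus;
    no bias correction is applied.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]

    n_types = len(freqs)                                  # number of allelic types
    shannon = -sum(p * math.log(p) for p in freqs)        # -sum p ln p, in nats
    heterozygosity = 1.0 - sum(p * p for p in freqs)      # 1 - sum p^2
    return n_types, shannon, heterozygosity


# Hypothetical sample of four gene copies at one locus.
sample = ["A", "A", "B", "C"]
n_types, shannon, heterozygosity = diversity_profile(sample)
print(n_types, round(shannon, 4), heterozygosity)
```

With equally frequent alleles, Shannon's information under these definitions equals the logarithm of the number of allelic types, which is one way to see how the three measures relate.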
Data availability
Simdata, Flydata, the Python code used to generate simulated data and an example output from that code will be uploaded to a publicly available repository.
Acknowledgements
I would like to thank Zlatko Jovanoski, Mark Tanaka and Russell Bonduriansky for providing helpful criticism and comments throughout the writing of this paper. Flydata were obtained from a previous study that was funded by the Australian Research Council Discovery Grant A10007270 to Sherwin, Oakeshott, Barker and Frankham. Jyoutsna Gupta participated significantly in the experiments that led to Flydata.
Author information
Contributions
GDO did the simulation design, programming and analysis. WBS conceived and supervised the project. FJ developed Jabot’s equation (in the Supplement S5.1) and all theoretical backing for its use. MRG conducted the study that produced the ‘Flydata’.
Cite this article
O’Reilly, G.D., Jabot, F., Gunn, M.R. et al. Predicting Shannon’s information for genes in finite populations: new uses for old equations. Conservation Genet Resour 12, 245–255 (2020). https://doi.org/10.1007/s12686-018-1079-z
DOI: https://doi.org/10.1007/s12686-018-1079-z