Predicting Shannon’s information for genes in finite populations: new uses for old equations

  • G. D. O’Reilly
  • F. Jabot
  • M. R. Gunn
  • W. B. Sherwin
Methods and Resources Article


This study provides predictive equations for Shannon’s information in a finite population that are intuitive and simple enough to see wide use in molecular ecology and population genetics. A comprehensive profile of genetic diversity contains three complementary components: number of allelic types, Shannon’s information and heterozygosity. Heterozygosity is currently better resourced than Shannon’s information: it has more predictive models and is integrated into more mainstream genetics software. However, Shannon’s information has several advantages over heterozygosity as a measure of genetic diversity, so it is important to develop it as a tool for molecular ecology. Past forecasts of Shannon’s information in specific molecular ecology scenarios mostly dealt with its expected value at genetic equilibrium, but dynamic forecasts are also vital. In particular, we must be able to predict loss of genetic diversity in finite populations, because such populations risk losing genetic variability, which can adversely affect their survival. We present equations for predicting loss of genetic diversity as measured by Shannon’s information. We also provide statistical justification for these models by assessing their fit to data from simulations and from managed, replicated laboratory populations. The predictive models will enhance the usefulness of Shannon’s information as a measure of genetic diversity; they will also be useful in pest control and conservation.
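The three components of the diversity profile named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions for a single locus (allele count, Shannon’s information as the negative sum of p ln p, and expected heterozygosity as one minus the sum of squared frequencies). The function name `diversity_profile`, the choice of natural logarithms and the example frequencies are illustrative assumptions, not taken from the paper, and the paper’s predictive equations are not reproduced here.

```python
import math


def diversity_profile(freqs):
    """Return (allele count, Shannon information in nats, expected heterozygosity).

    `freqs` holds allele frequencies or counts at one locus; values are
    normalised to sum to 1 and zero entries are ignored. Hypothetical helper
    for illustration only.
    """
    positive = [f for f in freqs if f > 0]
    if not positive:
        raise ValueError("at least one allele must have a positive frequency")
    total = sum(positive)
    p = [f / total for f in positive]

    n_alleles = len(p)                            # number of allelic types
    shannon = -sum(x * math.log(x) for x in p)    # Shannon's information (natural log assumed)
    heterozygosity = 1.0 - sum(x * x for x in p)  # expected heterozygosity
    return n_alleles, shannon, heterozygosity


if __name__ == "__main__":
    # Illustrative frequencies only, not data from the study.
    k, shannon, het = diversity_profile([0.5, 0.3, 0.15, 0.05])
    print(f"alleles={k}, Shannon={shannon:.4f}, heterozygosity={het:.4f}")
```

In this sketch rare alleles count fully towards the allele number, contribute moderately to Shannon’s information and contribute very little to heterozygosity, which is one way to see why the three measures are described as complementary.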


Keywords

Conservation genetics, Population genetics, Isolated populations, Small populations, Entropy, Simulations



I would like to thank Zlatko Jovanoski, Mark Tanaka and Russell Bonduriansky for providing helpful criticism and comments throughout the writing of this paper. Flydata were obtained from a previous study that was funded by the Australian Research Council Discovery Grant A10007270 to Sherwin, Oakeshott, Barker and Frankham. Jyoutsna Gupta participated significantly in the experiments that led to Flydata.

Author contributions

GDO designed the simulations and carried out the programming and analysis. WBS conceived and supervised the project. FJ developed Jabot’s equation (Supplement S5.1) and all theoretical backing for its use. MRG conducted the study that produced the ‘Flydata’.

Supplementary material

Supplementary material 1: 12686_2018_1079_MOESM1_ESM.docx (DOCX, 1352 KB)



Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Evolution and Ecology Research Centre, School of Biological Earth and Environmental Science, University of New South Wales, Sydney, Australia
  2. LISC - Laboratoire d’Ingénierie pour les Systèmes Complexes, IRSTEA, Aubière, France
  3. School of English and Media Studies, Massey University, Auckland, New Zealand
  4. Department of Engineering Science, University of Auckland, Auckland, New Zealand
