Abstract
The behaviour of a Pólya-like urn which generates Ewens' sampling formula in population genetics is investigated. Connections are made with the work of Watterson and Kingman and with the Poisson-Dirichlet distribution. The order in which novel types occur in the urn is shown to parallel the age distribution of the infinitely-many-alleles diffusion model, and consequences of this property are explored. Finally, the urn process is related to Kingman's coalescent with mutation to provide a rigorous basis for this parallel.
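The abstract does not spell out the urn mechanics, so the following is only a minimal sketch under the usual textbook description of such an urn: one black ball of mass theta plus coloured balls of mass 1. Drawing the black ball adds a ball of a brand-new colour, and drawing a coloured ball adds another ball of that colour. The function names `hoppe_urn`, `allelic_partition` and `ewens_probability` are hypothetical, chosen for illustration, and the formula coded in `ewens_probability` is the standard statement of Ewens' sampling formula rather than anything quoted from the paper.

```python
import random
from collections import Counter
from math import factorial


def hoppe_urn(n, theta, rng):
    """Simulate n draws from a Polya-like urn (assumed mechanics, see lead-in).

    Returns the type label of each ball added, in order of addition.
    Labels 0, 1, 2, ... record the order in which novel types appear.
    """
    labels = []
    next_type = 0
    for i in range(n):
        # Black ball has mass theta; the i coloured balls have mass 1 each.
        if rng.random() < theta / (theta + i):
            labels.append(next_type)           # novel type enters the urn
            next_type += 1
        else:
            labels.append(rng.choice(labels))  # copy the type of a drawn ball
    return labels


def allelic_partition(labels):
    """Return a where a[j-1] is the number of types represented j times."""
    a = [0] * len(labels)
    for size in Counter(labels).values():
        a[size - 1] += 1
    return a


def ewens_probability(a, theta):
    """Ewens' sampling formula: n!/theta_(n) * prod_j (theta/j)^a_j / a_j!"""
    n = sum(j * aj for j, aj in enumerate(a, start=1))
    rising = 1.0
    for i in range(n):
        rising *= theta + i                    # theta (theta+1) ... (theta+n-1)
    p = factorial(n) / rising
    for j, aj in enumerate(a, start=1):
        p *= (theta / j) ** aj / factorial(aj)
    return p
```

A simple check is to run `hoppe_urn` many times for a small n, tabulate `allelic_partition` of each run, and compare the empirical frequencies with `ewens_probability`.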
References
Blackwell, D., MacQueen, J. B.: Ferguson distributions via Pólya urn schemes. Ann. Statist. 1, 353–355 (1973)
Connor, R. J., Mosimann, J. E.: Concepts of independence for proportions with a generalization of the Dirichlet distribution. J. Am. Statist. Assoc. 64, 194–206 (1969)
Donnelly, P., Tavaré, S.: The ages of alleles and a coalescent. Adv. Appl. Probab. 18, 1–19 (1986)
Engen, S.: A note on the geometric series as a species frequency model. Biometrika 62, 694–699 (1975)
Ethier, S. N., Griffiths, R. C.: The infinitely-many-sites model as a measure-valued diffusion. Ann. Probab., to appear (1987)
Ethier, S. N., Kurtz, T. G.: The infinitely-many-neutral-alleles diffusion model. Adv. Appl. Probab. 13, 429–452 (1981)
Ethier, S. N., Kurtz, T. G.: Markov processes; characterization and convergence. New York: Wiley 1986
Ewens, W. J.: The sampling theory of selectively neutral alleles. Theor. Popul. Biol. 3, 87–112 (1972)
Ewens, W. J.: Testing for increased mutation rate for neutral alleles. Theor. Popul. Biol. 4, 251–258 (1973)
Ewens, W. J.: Mathematical population genetics. New York Heidelberg Berlin: Springer 1979
Fuerst, P. A., Chakraborty, R., Nei, M.: Statistical studies on protein polymorphism in natural populations. I. Distribution of single locus heterozygosity. Genetics 86, 455–483 (1977)
Good, I. J.: The estimation of probabilities. Cambridge: MIT Press 1965
Griffiths, R. C.: Lines of descent in the diffusion approximation of neutral Wright-Fisher models. Theor. Popul. Biol. 17, 37–50 (1980)
Griffiths, R. C., Li, W.-H.: Simulating allele frequencies in a population and the genetic differentiation of populations under mutation pressure. Theor. Popul. Biol. 23, 19–33 (1983)
Hill, B. M.: Posterior moments of the number of species in a finite population and the posterior probability of finding a new species. J. Am. Statist. Assoc. 74, 668–673 (1979)
Hoppe, F. M.: Pólya-like urns and the Ewens sampling formula. J. Math. Biol. 20, 91–99 (1984)
Hoppe, F. M.: Size-biased filtering of Poisson-Dirichlet samples with an application to partition structures in genetics. J. Appl. Probab. 23, 1008–1012 (1986)
Karlin, S., McGregor, J.: The number of mutant forms maintained in a population. Proc. Fifth Berk. Symp. Math. Stat. and Prob. II, 415–438 (1967)
Karlin, S., McGregor, J.: Addendum to a paper of W. Ewens. Theor. Popul. Biol. 3, 113–116 (1972)
Kelly, F. P.: Exact results for the Moran neutral allele model. Adv. Appl. Probab. 9, 197–201 (1977)
Kingman, J. F. C.: Random discrete distributions. J. Roy. Statist. Soc. B. 37, 1–22 (1975)
Kingman, J. F. C.: The population structure associated with Ewens' sampling formula. Theor. Popul. Biol. 11, 274–283 (1977)
Kingman, J. F. C.: Random partitions in population genetics. Proc. R. Soc. Lond. A. 361, 1–20 (1978)
Kingman, J. F. C.: The mathematics of genetic diversity. CBMS-NSF Regional Conference Series in Applied Mathematics, Vol. 34. Philadelphia, PA: S.I.A.M. 1980
Kingman, J. F. C.: On the genealogy of large populations. J. Appl. Prob. 19A, 27–43 (1982a)
Kingman, J. F. C.: The coalescent. Stoch. Proc. Applic. 13, 235–248 (1982b)
Kingman, J. F. C.: Exchangeability and the evolution of large populations. In: Koch, G., Spizzichino, F. (eds.) Exchangeability in probability and statistics. Amsterdam: North Holland 1982c
McCloskey, J. W.: A model for the distribution of individuals by species in an environment. Ph.D. thesis, Michigan State University (1965)
Patil, G. P., Taillie, C.: Diversity as a concept and its implications for random communities. Bull. Int. Stat. Inst. XLVII, 497–515 (1977)
Saunders, I. W., Tavaré, S., Watterson, G. A.: On the genealogy of nested subsamples from a haploid population. Adv. Appl. Prob. 16, 471–491 (1984)
Tavaré, S.: Line-of-descent and genealogical processes, and their applications in population genetics models. Theor. Popul. Biol. 26, 119–164 (1984)
Trajstman, A. C.: On a conjecture of G. A. Watterson. Adv. Appl. Prob. 6, 489–493 (1974)
Watterson, G. A.: Models for logarithmic species abundance distributions. Theor. Popul. Biol. 6, 217–250 (1974)
Watterson, G. A.: The sampling theory of selectively neutral alleles. Adv. Appl. Prob. 6, 463–488 (1974)
Watterson, G. A.: The stationary distribution of the infinitely-many neutral alleles diffusion model. J. Appl. Probab. 13, 639–651 (1976)
Watterson, G. A.: Reversibility and the age of an allele. I. Moran's infinitely-many neutral alleles model. Theor. Popul. Biol. 10, 239–253 (1976)
Watterson, G. A.: Lines of descent and the coalescent. Theor. Popul. Biol. 26, 72–92 (1984)
Watterson, G. A.: Estimating the divergence time of two species, to appear (1985)
Watterson, G. A., Guess, H. A.: Is the most frequent allele the oldest? Theor. Popul. Biol. 11, 141–160 (1977)
Wilks, S. S.: Mathematical statistics. New York: Wiley 1962
Additional information
This research was partially supported by the Sloan Foundation under Grant 85-6-14 and by the National Science Foundation
Cite this article
Hoppe, F.M. The sampling theory of neutral alleles and an urn model in population genetics. J. Math. Biology 25, 123–159 (1987). https://doi.org/10.1007/BF00276386