Distributions of Random Partitions and Their Applications

  • Charalambos A. Charalambides
Article

Abstract

Assume that a random sample of size \(m\) is selected from a population containing a countable number of classes (subpopulations) of elements (individuals). A partition of the set of sample elements into (unordered) subsets, with each subset containing the elements that belong to the same class, induces a random partition of the sample size \(m\), with part sizes \(\{Z_1,Z_2,\ldots,Z_N\}\) being positive integer-valued random variables. Alternatively, if \(N_j\) is the number of different classes that are represented in the sample by \(j\) elements, for \(j=1,2,\ldots,m\), then \((N_1,N_2,\ldots,N_m)\) represents the same random partition. The joint and the marginal distributions of \((N_1,N_2,\ldots,N_m)\), as well as the distribution of \(N=\sum_{j=1}^{m}N_j\), are of particular interest in statistical inference. From the inference point of view, it is desirable that all the information about the population is contained in \((N_1,N_2,\ldots,N_m)\). This requires that no physical, genetical or other kind of significance is attached to the actual labels of the population classes. In the present paper, combinatorial, probabilistic and compound sampling models are reviewed. Also, sampling models with population classes of random weights (proportions), and in particular the Ewens and Pitman sampling models, to which many publications are devoted, are extensively presented.
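The two representations of the partition described above can be illustrated with a small sketch. It is not taken from the paper: the function name `partition_from_sample` and the toy sample of class labels are hypothetical, and the code only shows how the part sizes \(Z_1,\ldots,Z_N\) and the counts \((N_1,\ldots,N_m)\) arise from one observed sample.

```python
from collections import Counter


def partition_from_sample(labels):
    """Summarise a sample of class labels as a partition of its size m.

    Returns (part_sizes, multiplicities, n_classes), where
      part_sizes     -- sizes Z_1, ..., Z_N of the observed classes, largest first
      multiplicities -- [N_1, ..., N_m], N_j = number of classes seen exactly j times
      n_classes      -- N = N_1 + ... + N_m, the number of distinct classes observed
    """
    m = len(labels)
    class_counts = Counter(labels)            # label -> number of sample elements
    part_sizes = sorted(class_counts.values(), reverse=True)
    size_counts = Counter(part_sizes)         # part size j -> N_j
    multiplicities = [size_counts.get(j, 0) for j in range(1, m + 1)]
    n_classes = sum(multiplicities)
    return part_sizes, multiplicities, n_classes


# Hypothetical sample of size m = 6 drawn from classes labelled "a", "b", "c".
sample = ["a", "b", "a", "c", "a", "b"]
z, n_vec, n = partition_from_sample(sample)
print(z)      # part sizes
print(n_vec)  # (N_1, ..., N_m)
print(n)      # N
```

The labels themselves are discarded after counting, which mirrors the requirement that no significance is attached to the actual class labels. Both outputs describe the same partition, so the identity \(\sum_{j=1}^{m} jN_j = m\) always holds.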

Keywords

Combinatorial sampling model; Compound sampling model; Dirichlet–Poisson distribution; Exchangeable random partitions; Ewens sampling formula; Partition structures; Pitman sampling formula; Pólya urn model; Stirling numbers

AMS 2000 Subject Classification

Primary 60C05, 62D05; Secondary 05A05, 05A17

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Department of Mathematics, University of Athens, Athens, Greece
