Abstract
Consider the random Dirichlet partition of the interval into n fragments with parameter π>0. Explicit results on the statistical structure of its size-biased permutation are recalled, leading to the (unordered) Ewens and (ordered) Donnelly-Tavaré-Griffiths sampling formulae from finite Dirichlet partitions. We use these preliminary results on the distribution of frequencies to address the following sampling problem: what are the intervals between newly sampled categories when sampling from Dirichlet populations? The results agree with those found in sampling theory from random proportions with GEM(γ) distribution, which can be recovered from the Dirichlet model in the Kingman limit n↑∞, π↓0 with nπ=γ>0.
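The sampling problem stated above can be explored numerically. The following is a minimal simulation sketch, not the paper's analytical method: it assumes the symmetric Dirichlet partition is built by normalising independent Gamma draws (a standard construction), and the names `dirichlet_partition`, `size_biased_permutation`, `new_category_times`, and the argument `shape` (standing in for the parameter π) are hypothetical, chosen for illustration only.

```python
import random


def dirichlet_partition(n, shape, rng):
    """Split [0, 1] into n fragments with a symmetric Dirichlet(shape) law.

    Assumed construction: normalise n i.i.d. Gamma(shape, 1) draws.
    """
    gammas = [rng.gammavariate(shape, 1.0) for _ in range(n)]
    total = sum(gammas)
    return [g / total for g in gammas]


def size_biased_permutation(fragments, rng):
    """Reorder fragments by repeatedly picking one with probability
    proportional to its size among those not yet picked."""
    remaining = list(fragments)
    ordered = []
    while remaining and sum(remaining) > 0.0:
        i = rng.choices(range(len(remaining)), weights=remaining)[0]
        ordered.append(remaining.pop(i))
    # Any leftover fragments have zero mass (numerical underflow); append them.
    return ordered + remaining


def new_category_times(fragments, n_new, rng):
    """Draw categories i.i.d. from `fragments` until n_new distinct ones
    have appeared; return the draw numbers at which each new one shows up."""
    positive = sum(1 for f in fragments if f > 0.0)
    if n_new > positive:
        raise ValueError("n_new exceeds the number of non-empty fragments")
    seen, times, draw = set(), [], 0
    labels = range(len(fragments))
    while len(seen) < n_new:
        draw += 1
        label = rng.choices(labels, weights=fragments)[0]
        if label not in seen:
            seen.add(label)
            times.append(draw)
    return times


rng = random.Random(0)  # arbitrary seed, for reproducibility only
p = dirichlet_partition(n=10, shape=1.0, rng=rng)
times = new_category_times(p, n_new=4, rng=rng)
# Waiting intervals between successive new categories.
intervals = [b - a for a, b in zip(times, times[1:])]
print(times, intervals)
```

The first draw always reveals a new category, so `times[0]` is 1 and the quantities of interest are the later gaps in `intervals`. Taking `n` large with `shape = gamma / n` for a fixed `gamma` is one way to probe the Kingman regime mentioned in the abstract, though very small `shape` values may produce fragments that underflow to zero in floating point.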
References
Barrera, J., Huillet, T. and Paroissin, C. (2005). Size-biased permutation of Dirichlet partitions and search-cost distribution, Probability in the Engineering & Informational Sciences, 19(1), 83–97.
Donnelly, P. (1986). Partition structures, Pólya urns, the Ewens sampling formula and the age of alleles, Theoretical Population Biology, 30, 271–288.
Donnelly, P. (1991). The heaps process, libraries and size-biased permutation, Journal of Applied Probability, 28, 321–335.
Donnelly, P. and Tavaré, S. (1986). The age of alleles and a coalescent, Advances in Applied Probability, 18, 1–19.
Ewens, W. J. (1972). The sampling theory of selectively neutral alleles, Theoretical Population Biology, 3, 87–112.
Ewens, W. J. (1990). Population genetics theory—the past and the future, Mathematical and Statistical Developments of Evolutionary Theory (ed. S. Lessard), Kluwer, Dordrecht.
Ewens, W. J. (1996). Some remarks on the law of succession, Athens Conference on Applied Probability and Time Series Analysis (1995), Vol. I, Lecture Notes in Statistics, 114, 229–244, Springer, New York.
Huillet, T. (2003). Sampling problems for randomly broken sticks, Journal of Physics A, 36(14), 3947–3960.
Huillet, T. (2005). Sampling formulae arising from random Dirichlet populations, Communications in Statistics: Theory and Methods (to appear).
Huillet, T. and Martinez, S. (2003). Sampling from finite random partitions, Methodology and Computing in Applied Probability, 5(4), 467–492.
Kingman, J. F. C. (1975). Random discrete distributions, Journal of the Royal Statistical Society. Series B, 37, 1–22.
Kingman, J. F. C. (1993). Poisson Processes, Clarendon Press, Oxford.
Pitman, J. (1996). Random discrete distributions invariant under size-biased permutation, Advances in Applied Probability, 28, 525–539.
Pitman, J. (1999). Coalescents with multiple collisions, Annals of Probability, 27(4), 1870–1902.
Pitman, J. (2002). Poisson-Dirichlet and GEM invariant distributions for split-and-merge transformation of an interval partition, Combinatorics, Probability and Computing, 11(5), 501–514.
Pitman, J. and Yor, M. (1997). The two parameter Poisson-Dirichlet distribution derived from a stable subordinator, Annals of Probability, 25, 855–900.
Sibuya, M. and Yamato, H. (1995). Ordered and unordered random partitions of an integer and the GEM distribution, Statistics & Probability Letters, 25(2), 177–183.
Tavaré, S. and Ewens, W. J. (1997). Multivariate Ewens distribution, Discrete Multivariate Distributions (eds. N. L. Johnson, S. Kotz and N. Balakrishnan), 41, 232–246, Wiley, New York.
Yamato, H. (1997). On the Donnelly-Tavaré-Griffiths formula associated with the coalescent, Communications in Statistics: Theory and Methods, 26(3), 589–599.
Yamato, H., Sibuya, M. and Nomachi, T. (2001). Ordered sample from two-parameter GEM distribution, Statistics & Probability Letters, 55(1), 19–27.
Huillet, T. Unordered and ordered sample from Dirichlet distribution. Ann Inst Stat Math 57, 597–616 (2005). https://doi.org/10.1007/BF02509241
DOI: https://doi.org/10.1007/BF02509241