Abstract
We connect the empirical, or ‘occupation’, laws of certain discrete-space, time-inhomogeneous Markov chains related to simulated annealing to a novel class of ‘stick-breaking’ processes, a ‘nonexchangeable’ generalization of the Dirichlet process used in nonparametric Bayesian statistics. To establish this unexpected correspondence, we examine an intermediate ‘clumped’ structure, perhaps of independent interest, in both the time-inhomogeneous Markov chains and the stick-breaking processes; it records the sequence of different states visited and the scaled proportions of time spent in them. By matching these intermediate structures, we identify the limits of the empirical measures of the time-inhomogeneous Markov chains as types of stick-breaking processes.
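For orientation, the two objects named in the abstract can be illustrated with a minimal Python sketch. It is not the construction studied in the article. The function names (`stick_breaking_weights`, `gem_weights`, `clump`) are hypothetical. The i.i.d. Beta(1, θ) break fractions correspond to the classical exchangeable Dirichlet-process case, not the nonexchangeable generalization. Dividing run lengths by the total path length is only one assumed reading of "scaled proportions of time".

```python
import random


def stick_breaking_weights(fractions):
    """Map break fractions V_1, V_2, ... in [0, 1] to weights
    P_k = V_k * prod_{j<k} (1 - V_j)."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for v in fractions:
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


def gem_weights(theta, n, rng=None):
    """First n weights of the classical stick-breaking sequence with
    i.i.d. Beta(1, theta) fractions (assumption: exchangeable case only)."""
    rng = rng or random.Random()
    fractions = [rng.betavariate(1.0, theta) for _ in range(n)]
    return stick_breaking_weights(fractions)


def clump(path):
    """Collapse a path into (state, fraction of time) pairs, one pair per
    maximal run of a repeated state (assumed scaling: divide by path length)."""
    runs = []
    for state in path:
        if runs and runs[-1][0] == state:
            runs[-1][1] += 1  # still in the same run
        else:
            runs.append([state, 1])  # a new state is visited
    return [(state, count / len(path)) for state, count in runs]
```

Here `clump(["a", "a", "b", "a"])` keeps the two separate visits to `"a"` as distinct entries, which is the sense in which the clumped structure records the sequence of states visited rather than only the total time spent in each state.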
Acknowledgments
We thank the referees and editors for their constructive feedback. Research was partially supported by ARO W911NF-18-1-0311, a Simons Foundation Sabbatical grant, and a Daniel Bartlett graduate fellowship.
Cite this article
Dietz, Z., Lippitt, W. & Sethuraman, S. Stick-Breaking processes, Clumping, and Markov Chain Occupation Laws. Sankhya A 85, 129–171 (2023). https://doi.org/10.1007/s13171-020-00236-x
DOI: https://doi.org/10.1007/s13171-020-00236-x