Structured priors for sparse probability vectors with application to model selection in Markov chains

  • Matthew Heiner
  • Athanasios Kottas
  • Stephan Munch


We develop two prior distributions for probability vectors which, in contrast to the popular Dirichlet distribution, retain sparsity properties in the presence of data. Our models are appropriate for count data with many categories, most of which are expected to have negligible probability. Both models are tractable, allowing for efficient posterior sampling and marginalization. Consequently, they can replace the Dirichlet prior in hierarchical models without sacrificing convenient Gibbs sampling schemes. We derive both models and demonstrate their properties. We then illustrate their use for model-based selection with a hierarchical model in which we infer the active lag from time-series data. Using a squared-error loss, we demonstrate the utility of the models for data simulated from a nearly deterministic dynamical system. We also apply the prior models to an ecological time series of Chinook salmon abundance, demonstrating their ability to extract insights into the lag dependence.
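As background for the stick-breaking construction and generalized Dirichlet distribution named in the keywords, the following is a minimal sketch of the standard Connor and Mosimann construction. It is not the authors' sparsity priors, which this page does not specify. The function names `stick_breaking_sample` and `dirichlet_stick_params` are hypothetical and chosen for illustration only.

```python
import random


def stick_breaking_sample(a, b, rng):
    """Draw one probability vector of length len(a) + 1 by stick-breaking.

    Each stick proportion Z_k ~ Beta(a[k], b[k]) takes a fraction of the
    mass left over by the earlier categories; the last category receives
    whatever remains. This is the generalized Dirichlet construction,
    used here only as an illustration.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    theta = []
    remaining = 1.0
    for a_k, b_k in zip(a, b):
        z = rng.betavariate(a_k, b_k)  # proportion of the remaining stick
        theta.append(z * remaining)
        remaining *= 1.0 - z
    theta.append(remaining)  # final category gets the leftover mass
    return theta


def dirichlet_stick_params(alpha):
    """Beta parameters under which stick-breaking recovers Dirichlet(alpha).

    Setting a[k] = alpha[k] and b[k] = sum of the later alphas is the
    special case in which the generalized Dirichlet reduces to the
    ordinary Dirichlet distribution.
    """
    a = list(alpha[:-1])
    b = [sum(alpha[k + 1:]) for k in range(len(alpha) - 1)]
    return a, b


if __name__ == "__main__":
    rng = random.Random(1)
    a, b = dirichlet_stick_params([0.5, 0.5, 0.5, 0.5])
    print(stick_breaking_sample(a, b, rng))
```

Because each category is governed by its own Beta draw, the construction offers more freedom than the Dirichlet, whose shape parameters are tied together as in `dirichlet_stick_params`.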


Generalized Dirichlet distribution; Mixture transition distribution; Nonlinear dynamics; Sparsity prior; Stick-breaking construction



The work of the first and the second author was supported in part by the National Science Foundation under award DMS 1310438. The authors gratefully acknowledge helpful comments from an editor and two anonymous referees.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Statistics, University of California, Santa Cruz, USA
  2. Fisheries Ecology Division, Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, Santa Cruz, USA
