# Structured priors for sparse probability vectors with application to model selection in Markov chains

## Abstract

We develop two prior distributions for probability vectors which, in contrast to the popular Dirichlet distribution, retain sparsity properties in the presence of data. Our models are appropriate for count data with many categories, most of which are expected to have negligible probability. Both models are tractable, allowing for efficient posterior sampling and marginalization. Consequently, they can replace the Dirichlet prior in hierarchical models without sacrificing convenient Gibbs sampling schemes. We derive both models and demonstrate their properties. We then illustrate their use for model-based selection with a hierarchical model in which we infer the active lag from time-series data. Using a squared-error loss, we demonstrate the utility of the models for data simulated from a nearly deterministic dynamical system. We also apply the prior models to an ecological time series of Chinook salmon abundance, demonstrating their ability to extract insights into the lag dependence.
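The contrast with the Dirichlet prior can be made concrete with a small sketch. It does not implement the two priors developed in the paper; it only shows the standard conjugate Dirichlet-multinomial update, with hypothetical values for the number of categories, the prior parameters, and the counts. Under that update every category keeps strictly positive posterior mean, including those never observed.

```python
from fractions import Fraction


def dirichlet_posterior_mean(alpha, counts):
    """Posterior mean of a probability vector under a Dirichlet(alpha) prior
    and multinomial counts: (alpha_k + n_k) / (sum(alpha) + sum(counts))."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    total = sum(alpha) + sum(counts)
    return [(a + n) / total for a, n in zip(alpha, counts)]


# Hypothetical setup: 10 categories, symmetric Dirichlet(1/2) prior,
# and 50 observations that fall into only two categories.
K = 10
alpha = [Fraction(1, 2)] * K
counts = [40, 10] + [0] * (K - 2)

post = dirichlet_posterior_mean(alpha, counts)

# Total posterior mean mass on the eight categories that were never observed.
unobserved_mass = sum(p for p, n in zip(post, counts) if n == 0)
print(float(unobserved_mass))
```

In this toy setting the unobserved categories jointly retain posterior mean mass `sum(alpha_k) / (sum(alpha) + n)` over the unobserved `k`, which shrinks with the sample size `n` but never reaches zero, and grows with the number of categories.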

## Keywords

Generalized Dirichlet distribution · Mixture transition distribution · Nonlinear dynamics · Sparsity prior · Stick-breaking construction

## Notes

### Acknowledgements

The work of the first and the second author was supported in part by the National Science Foundation under award DMS 1310438. The authors gratefully acknowledge helpful comments from an editor and two anonymous referees.
