Abstract
Inference in Bayesian nonparametric latent feature models is generally difficult, especially in high-dimensional settings, because it typically requires proposing features from the prior distribution. Only in special cases, where the integration over the prior is tractable, can new feature assignments be sampled according to a predictive likelihood. We present a novel method to accelerate the mixing of latent variable model inference by proposing feature locations based on the data, rather than the prior. First, we introduce an accelerated feature proposal mechanism, which we show yields a valid MCMC algorithm for posterior inference. Next, we propose an approximate inference strategy that performs accelerated inference in parallel. A two-stage algorithm combining the two approaches provides a computationally attractive method that quickly reaches local convergence to the posterior distribution of our model while exploiting parallelization.
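To make the core idea concrete, the following is a minimal sketch of a data-driven feature proposal with a Metropolis-Hastings correction, the ingredient that lets such proposals remain a valid MCMC scheme. It is not the paper's implementation: the isotropic Gaussian model and every name and parameter (`log_gauss`, `mh_step`, `PROP_SD`, and so on) are assumptions made purely for illustration.

```python
import numpy as np

# Illustrative sketch only: a data-driven proposal for a new feature
# (cluster) location, corrected by a Metropolis-Hastings ratio so the
# chain still targets the posterior. The model (isotropic Gaussian
# prior and likelihood) and all names/parameters are assumptions for
# exposition, not the paper's algorithm.

rng = np.random.default_rng(0)

PRIOR_SD = 10.0   # vague prior over feature locations
LIK_SD = 1.0      # observation noise
PROP_SD = 1.0     # width of the data-driven proposal

def log_gauss(x, mean, sd):
    # Log-density of an isotropic Gaussian, summed over dimensions.
    return -0.5 * np.sum((x - mean) ** 2) / sd ** 2 \
           - x.size * np.log(sd * np.sqrt(2.0 * np.pi))

def log_target(mu, x):
    # Unnormalized log-posterior of a feature location given one data
    # point: log p(x | mu) + log p(mu).
    return log_gauss(x, mu, LIK_SD) + log_gauss(mu, 0.0, PRIOR_SD)

def mh_step(mu, x):
    # Independence MH step: propose mu' ~ q(. | x) centered at the data
    # point (not the prior), and accept with probability
    # min(1, [p(mu' | x) q(mu | x)] / [p(mu | x) q(mu' | x)]).
    mu_prop = x + PROP_SD * rng.standard_normal(x.shape)
    log_ratio = (log_target(mu_prop, x) + log_gauss(mu, x, PROP_SD)
                 - log_target(mu, x) - log_gauss(mu_prop, x, PROP_SD))
    return mu_prop if np.log(rng.uniform()) < log_ratio else mu

# Usage: in 50 dimensions a prior draw lands far from the data, but the
# data-driven proposal is typically accepted within a few steps.
x = 5.0 + rng.standard_normal(50)        # observed data point
mu = PRIOR_SD * rng.standard_normal(50)  # initial location from the prior
for _ in range(10):
    mu = mh_step(mu, x)
```

The design point is the acceptance ratio: because the proposal density q(mu' | x) appears in both numerator and denominator, centering it on the data rather than the prior changes only how quickly good feature locations are found, not the stationary distribution.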
Additional information
The contribution of Michael Zhang and Sinead Williamson was funded by NSF grant IIS 1447721. The contribution of Michael Zhang was also funded by the University of Hong Kong’s Seed Fund for Basic Research for New Staff.
Cite this article
Zhang, M.M., Williamson, S.A. & Pérez-Cruz, F. Accelerated parallel non-conjugate sampling for Bayesian non-parametric models. Stat Comput 32, 50 (2022). https://doi.org/10.1007/s11222-022-10108-z
Keywords
- Machine learning
- Bayesian non-parametrics
- Scalable inference
- Parallel computing