Asynchronous Stochastic Variational Inference

  • Saad Mohamad
  • Abdelhamid Bouchachia
  • Moamar Sayed-Mouchaweh
Conference paper
Part of the Proceedings of the International Neural Networks Society book series (INNS, volume 1)


Stochastic variational inference (SVI) employs stochastic optimization to scale Bayesian computation to massive data. Since SVI is at its core a stochastic gradient-based algorithm, horizontal parallelism can be harnessed to allow larger-scale inference. We propose a lock-free parallel implementation of SVI that distributes computation over multiple slaves in an asynchronous style. We show that our implementation achieves linear speed-up while guaranteeing an asymptotic ergodic convergence rate of \(O(1/\sqrt{T})\), provided that the number of slaves is bounded by \(\sqrt{T}\), where \(T\) is the total number of iterations. The implementation runs in a high-performance computing environment using the Message Passing Interface for Python (MPI4py). The empirical evaluation shows that our parallel SVI is lossless, performing comparably to its serial counterpart while achieving linear speed-up.
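
To make the idea of asynchronous updates concrete, the following is a minimal single-process sketch, not the authors' MPI4py implementation. It simulates the effect of asynchrony by computing each noisy natural-gradient step from a stale copy of the global variational parameters. Everything specific here is an assumption chosen for illustration: the toy model (a two-component Gaussian mixture with unit variance and a standard normal prior on each mean), the step-size schedule, the mini-batch size, and the hypothetical names `make_data`, `responsibilities`, and `delayed_svi`.

```python
import math
import random
from collections import deque


def make_data(n, rng):
    """Draw n points from an equal mixture of N(-3, 1) and N(3, 1)."""
    return [rng.gauss(rng.choice((-3.0, 3.0)), 1.0) for _ in range(n)]


def responsibilities(x, lam):
    """Local step: q(z = k) for one point, given global parameters lam.

    lam holds one pair (a, b) per component, where b is the precision of
    q(mu_k) and a is precision times mean.
    """
    logits = []
    for a, b in lam:
        m = a / b
        # E_q[log N(x | mu_k, 1)] up to a constant that does not depend on k.
        logits.append(x * m - 0.5 * (m * m + 1.0 / b))
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def delayed_svi(data, delay, iters=3000, batch=10, seed=0):
    """SVI where each update is computed from parameters `delay` steps old.

    delay=0 recovers ordinary serial SVI. Returns the sorted posterior means.
    """
    rng = random.Random(seed)
    n = len(data)
    lam = [(-1.0, 1.0), (1.0, 1.0)]  # assumed initialisation
    # Keep the last delay + 1 parameter versions; history[0] is the stalest.
    history = deque([lam], maxlen=delay + 1)
    for t in range(1, iters + 1):
        stale = history[0]
        scale = n / batch
        a_hat = [0.0, 0.0]  # prior precision * prior mean = 0
        b_hat = [1.0, 1.0]  # prior precision = 1
        for _ in range(batch):
            x = data[rng.randrange(n)]
            r = responsibilities(x, stale)
            for k in range(2):
                a_hat[k] += scale * r[k] * x
                b_hat[k] += scale * r[k]
        rho = (t + 10) ** -0.7  # Robbins-Monro step size (assumed schedule)
        lam = [
            ((1 - rho) * a + rho * ah, (1 - rho) * b + rho * bh)
            for (a, b), ah, bh in zip(lam, a_hat, b_hat)
        ]
        history.append(lam)
    return sorted(a / b for a, b in lam)


if __name__ == "__main__":
    points = make_data(1000, random.Random(1))
    print("serial :", delayed_svi(points, delay=0))
    print("delayed:", delayed_svi(points, delay=8))
```

Comparing `delay=0` with a positive delay gives a rough feel for why bounded staleness need not harm the final estimate. As a purely arithmetic illustration of the stated bound, \(T = 10^6\) iterations would allow up to \(\sqrt{T} = 1000\) slaves.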



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Saad Mohamad (1)
  • Abdelhamid Bouchachia (1)
  • Moamar Sayed-Mouchaweh (2)
  1. Department of Computing, Bournemouth University, Poole, UK
  2. Department of Informatics and Automatics, Ecole des Mines, Douai, France
