Fast Variational Inference for Gaussian Process Models Through KL-Correction

  • Nathaniel J. King
  • Neil D. Lawrence
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)


Variational inference is a flexible approach to solving problems of intractability in Bayesian models. Unfortunately, the convergence of variational methods is often slow. We review a recently suggested variational approach for approximate inference in Gaussian process (GP) models and show how convergence may be dramatically improved by adding a positive correction term to the standard variational bound. We refer to the modified bound as a KL-corrected bound. The KL-corrected bound is a lower bound on the true likelihood, but an upper bound on the original variational bound. Timing comparisons between optimisation of the two bounds show that optimising the new bound consistently improves the speed of convergence.
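
The ordering of the bounds described in the abstract can be written out as a short sketch. The notation is an assumption made here for illustration and is not taken from the abstract: y denotes the observations, f the latent GP function values, θ the kernel parameters, q the variational distribution, and Δ the correction term. The first line is the standard variational identity. The second and third lines restate the abstract's claims. The abstract does not give the explicit form of Δ, so it is left unspecified.

```latex
% Assumed notation (not from the abstract): y = observations, f = latent GP values,
% \theta = kernel parameters, q = variational distribution, \Delta = correction term.
\begin{align}
\log p(\mathbf{y}\mid\boldsymbol{\theta})
  &= \mathcal{L}(q,\boldsymbol{\theta})
   + \mathrm{KL}\!\left(q(\mathbf{f})\,\|\,p(\mathbf{f}\mid\mathbf{y},\boldsymbol{\theta})\right), \\
\mathcal{L}_{\mathrm{KL}}(q,\boldsymbol{\theta})
  &= \mathcal{L}(q,\boldsymbol{\theta}) + \Delta(q,\boldsymbol{\theta}),
  \qquad \Delta(q,\boldsymbol{\theta}) \ge 0, \\
\mathcal{L}(q,\boldsymbol{\theta})
  &\le \mathcal{L}_{\mathrm{KL}}(q,\boldsymbol{\theta})
   \le \log p(\mathbf{y}\mid\boldsymbol{\theta}).
\end{align}
```

Because the KL divergence is non-negative, the standard bound never exceeds the log likelihood. The KL-corrected bound sits between the two, which is consistent with the faster convergence reported when it is optimised in place of the standard bound.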


Gaussian Process, Noise Model, Marginal Likelihood, Kernel Parameter, Variational Inference



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Nathaniel J. King (1)
  • Neil D. Lawrence (1)
  1. Department of Computer Science, University of Sheffield, Regent Court, Sheffield, United Kingdom
