Knowledge and Information Systems

Volume 55, Issue 2, pp 415–435

Distributed robust Gaussian Process regression

  • Sebastian Mair
  • Ulf Brefeld
Regular Paper


We study distributed and robust Gaussian Processes where robustness is introduced by a Gaussian Process prior on the function values combined with a Student-t likelihood. The posterior distribution is approximated by a Laplace Approximation, and together with concepts from Bayesian Committee Machines, we efficiently distribute the computations and render robust GPs on huge data sets feasible. We provide a detailed derivation and report on empirical results. Our findings on real and artificial data show that our approach outperforms existing baselines in the presence of outliers by using all available data.
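To illustrate the committee-machine idea named in the abstract, the minimal Python sketch below fuses Gaussian predictions from several experts, each assumed to have been trained on a disjoint subset of the data. It uses the standard Bayesian Committee Machine combination rule and is not taken from the paper's derivation. The function name `bcm_combine` and all numbers are hypothetical, and each expert's (Laplace-approximated) prediction at a single test point is assumed to be already summarized by a mean and a variance.

```python
from typing import List, Tuple


def bcm_combine(
    means: List[float],
    variances: List[float],
    prior_variance: float,
) -> Tuple[float, float]:
    """Fuse Gaussian expert predictions at one test point (standard BCM rule).

    Hypothetical helper for illustration only; not the paper's implementation.
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one variance per expert mean")
    m = len(means)
    # Expert precisions add up. Every expert already contains the prior once,
    # so (m - 1) copies of the prior precision are subtracted again.
    precision = sum(1.0 / v for v in variances) + (1 - m) / prior_variance
    if precision <= 0.0:
        raise ValueError("combined precision must be positive")
    variance = 1.0 / precision
    # The fused mean is the precision-weighted sum of the expert means.
    mean = variance * sum(mu / v for mu, v in zip(means, variances))
    return mean, variance


# Two experts; the second is more confident and therefore gets more weight.
mean, var = bcm_combine([1.0, 1.2], [0.5, 0.25], prior_variance=2.0)
```

Because only per-expert means and variances are exchanged, each expert can be fitted on its own machine, which is what makes this kind of fusion attractive for large data sets.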


Keywords: Robust regression; Gaussian Process regression; Student-t likelihood; Laplace Approximation; Distributed computation



Copyright information

© Springer-Verlag London Ltd. 2017

Authors and Affiliations

  1. Leuphana University of Lüneburg, Lüneburg, Germany
