Bayesian nonparametric regression with varying residual density

  • Debdeep Pati
  • David B. Dunson


We consider the problem of robust Bayesian inference on the mean regression function, allowing the residual density to change flexibly with predictors. The proposed class of models is based on a Gaussian process (GP) prior for the mean regression function and mixtures of Gaussians for the collection of residual densities indexed by predictors. Considering the homoscedastic case first, we propose priors for the residual density based on probit stick-breaking mixtures. We provide sufficient conditions to ensure strong posterior consistency in estimating the regression function, generalizing existing theory focused on parametric residual distributions. The homoscedastic priors are then generalized to allow residual densities to change nonparametrically with predictors by incorporating GPs into the stick-breaking components. This leads to a robust Bayesian regression procedure that automatically down-weights outliers and influential observations in a locally adaptive manner. The methods are illustrated using simulated and real data applications.
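As a rough illustration of the probit stick-breaking construction mentioned above, the following sketch maps real-valued latent variables to mixture weights and draws residuals from a finite Gaussian mixture that is symmetric about zero. The truncation level `H`, the hyperparameter values, the symmetrization by a random sign, and all function names are assumptions made for this example, not details taken from the paper.

```python
import math
import numpy as np

def normal_cdf(x):
    """Standard normal CDF, applied elementwise."""
    flat = [0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in np.ravel(x)]
    return np.array(flat).reshape(np.shape(x))

def probit_stick_breaking_weights(alpha):
    """Map real-valued alpha_1..alpha_H to mixture weights.

    pi_h = Phi(alpha_h) * prod_{l<h} (1 - Phi(alpha_l)). The last stick
    takes the remaining mass so the truncated weights sum to one
    (the truncation itself is an assumption of this sketch).
    """
    v = normal_cdf(np.asarray(alpha, dtype=float))
    v[-1] = 1.0
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def sample_symmetric_residuals(n, weights, locs, scales, rng):
    """Draw from sum_h pi_h * 0.5 * [N(mu_h, s_h^2) + N(-mu_h, s_h^2)]."""
    comp = rng.choice(len(weights), size=n, p=weights)
    sign = rng.choice([-1.0, 1.0], size=n)  # random sign makes the density symmetric
    return sign * locs[comp] + scales[comp] * rng.standard_normal(n)

rng = np.random.default_rng(0)
H = 20                                   # hypothetical truncation level
alpha = rng.normal(0.0, 1.0, size=H)     # latent probit-scale variables
weights = probit_stick_breaking_weights(alpha)
locs = rng.normal(0.0, 1.0, size=H)      # illustrative component locations
scales = np.sqrt(1.0 / rng.gamma(3.0, 1.0, size=H))  # illustrative component scales
residuals = sample_symmetric_residuals(5000, weights, locs, scales, rng)
```

In the predictor-dependent extension described in the abstract, each `alpha` would presumably become a function of the predictors (for instance a GP draw evaluated at each design point), so that the weights, and hence the residual density, vary with location.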


Data augmentation; Exact block Gibbs sampler; Gaussian process; Nonparametric regression; Outliers; Symmetrized probit stick-breaking process



This research was partially supported by grant number R01 ES017240-01 from the National Institute of Environmental Health Sciences (NIEHS) of the National Institutes of Health (NIH). The authors would like to thank Prof. Taeryon Choi for helpful suggestions on the manuscript.



Copyright information

© The Institute of Statistical Mathematics, Tokyo 2013

Authors and Affiliations

  1. Department of Statistics, Florida State University, Tallahassee, USA
  2. Department of Statistical Science, Duke University, Durham, USA
