Bayesian nonparametric regression with varying residual density

Annals of the Institute of Statistical Mathematics

Abstract

We consider the problem of robust Bayesian inference on the mean regression function while allowing the residual density to change flexibly with predictors. The proposed class of models is based on a Gaussian process (GP) prior for the mean regression function and mixtures of Gaussians for the collection of residual densities indexed by predictors. Initially considering the homoscedastic case, we propose priors for the residual density based on probit stick-breaking mixtures. We provide sufficient conditions to ensure strong posterior consistency in estimating the regression function, generalizing existing theory focused on parametric residual distributions. The homoscedastic priors are then generalized to allow residual densities to change nonparametrically with predictors by incorporating GPs in the stick-breaking components. This leads to a robust Bayesian regression procedure that automatically down-weights outliers and influential observations in a locally adaptive manner. The methods are illustrated using simulated and real data applications.
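
To make the phrase "probit stick-breaking mixtures" concrete, the following is a minimal sketch of how such mixture weights can be built for a finite truncation and used in a Gaussian mixture for the residual density. It assumes the usual probit stick-breaking form, in which each stick proportion is the standard normal CDF of a real-valued latent variable. The truncation level `H`, the function names, and the way the locations and scales are drawn are hypothetical choices for illustration only; they are not the priors or the sampler used in the article.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_stick_breaking_weights(alphas):
    """Map real-valued alphas to mixture weights.

    pi_h = Phi(alpha_h) * prod_{l<h} (1 - Phi(alpha_l)); the last weight
    takes whatever stick remains, so the truncated weights sum to one.
    """
    weights = []
    remaining = 1.0
    for alpha in alphas[:-1]:
        v = normal_cdf(alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def residual_density(eps, weights, locations, scales):
    """Evaluate a finite Gaussian mixture density at residual eps."""
    total = 0.0
    for w, m, s in zip(weights, locations, scales):
        z = (eps - m) / s
        total += w * math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
    return total


rng = random.Random(0)
H = 10  # truncation level: an assumption made for this illustration
alphas = [rng.gauss(0.0, 1.0) for _ in range(H)]
weights = probit_stick_breaking_weights(alphas)
locations = [rng.gauss(0.0, 1.0) for _ in range(H)]  # hypothetical draws
scales = [0.5 + rng.random() for _ in range(H)]      # hypothetical draws
print(sum(weights))
print(residual_density(0.0, weights, locations, scales))
```

In the predictor-dependent extension described in the abstract, each fixed `alpha` would be replaced by a function of the predictors drawn from a GP, so that the weights, and hence the residual density, vary with the predictors.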

Acknowledgments

This research was partially supported by grant number R01 ES017240-01 from the National Institute of Environmental Health Sciences (NIEHS) of the National Institutes of Health (NIH). The authors would like to thank Prof. Taeryon Choi for helpful suggestions on the manuscript.

Author information

Corresponding author

Correspondence to Debdeep Pati.

About this article

Cite this article

Pati, D., Dunson, D.B. Bayesian nonparametric regression with varying residual density. Ann Inst Stat Math 66, 1–31 (2014). https://doi.org/10.1007/s10463-013-0415-z

  • DOI: https://doi.org/10.1007/s10463-013-0415-z
