Abstract
Recent computational advances have made it feasible to fit hierarchical models in a wide range of serious applications. In the process, the question of model adequacy arises. While model checking usually addresses the entire model specification, model failures can occur at each hierarchical stage. Such failures include outliers, mean structure errors, dispersion misspecification, and inappropriate exchangeabilities. We propose an approach which is entirely simulation-based. Given a model specification and a dataset, we need only be able to simulate draws from the resultant posterior. By replicating a posterior of interest using data obtained under the model, we can “see” the extent of variability in such a posterior. Then, we can compare the posterior obtained under the observed data with this medley of posterior replicates to ascertain whether the former is in agreement with them and, accordingly, whether it is plausible that the observed data came from the proposed model. Many such comparisons can be run, each focusing on a different potential model failure. Focusing on generalized linear mixed models, we explore the questions of when hierarchical model stages are separable and checkable, and illustrate the approach with both real and simulated data.
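The replicate-and-compare idea described above can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes a two-stage normal model, y_i | theta_i ~ N(theta_i, sigma^2) and theta_i ~ N(mu, tau^2), with `mu`, `sigma`, and `tau` treated as known so that posterior draws are available in closed form. The function names, the discrepancy (the posterior mean of the largest standardized random effect), and the example data are all hypothetical choices made for illustration.

```python
import random


def posterior_summary(y, mu, sigma, tau, n_draws, rng):
    """Estimate the posterior mean of max_i |theta_i - mu| / tau from draws."""
    shrink = sigma**2 / (sigma**2 + tau**2)        # weight given to the prior mean
    post_sd = ((1.0 - shrink) * sigma**2) ** 0.5   # conjugate posterior sd of theta_i
    post_means = [shrink * mu + (1.0 - shrink) * yi for yi in y]
    total = 0.0
    for _ in range(n_draws):
        # One joint posterior draw of all theta_i, reduced to a scalar discrepancy.
        total += max(abs(rng.gauss(m, post_sd) - mu) / tau for m in post_means)
    return total / n_draws


def simulate_data(n, mu, sigma, tau, rng):
    """Draw a dataset from the assumed model: theta_i first, then y_i."""
    return [rng.gauss(rng.gauss(mu, tau), sigma) for _ in range(n)]


def monte_carlo_check(y_obs, mu, sigma, tau, n_reps=200, n_draws=200, seed=0):
    """Compare the observed posterior summary with summaries from replicated data."""
    rng = random.Random(seed)
    observed = posterior_summary(y_obs, mu, sigma, tau, n_draws, rng)
    replicates = [
        posterior_summary(simulate_data(len(y_obs), mu, sigma, tau, rng),
                          mu, sigma, tau, n_draws, rng)
        for _ in range(n_reps)
    ]
    # Monte Carlo test: rank of the observed summary among the replicates.
    exceed = sum(r >= observed for r in replicates)
    p_value = (exceed + 1) / (n_reps + 1)
    return observed, replicates, p_value


# Hypothetical data: nine unremarkable units and one with an extreme value.
y_example = [0.1, -0.3, 0.5, 1.2, -0.8, 0.4, -1.1, 0.9, 0.2, 8.0]
obs, reps, p = monte_carlo_check(y_example, mu=0.0, sigma=1.0, tau=1.0)
print(f"observed summary = {obs:.2f}, Monte Carlo p-value = {p:.3f}")
```

A small p-value here indicates that the observed posterior summary sits in the tail of the replicated summaries, which would cast doubt on the second-stage assumption for at least one unit. Swapping in a different summary targets a different potential failure, in the spirit of running many such comparisons.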
Additional information
Research supported in part by NSF SCREMS grant DMS-9506557, NSF grant DMS-9301316 and by the Natural Sciences and Engineering Research Council of Canada.
Cite this article
Dey, D.K., Gelfand, A.E., Swartz, T.B. et al. A simulation-intensive approach for checking hierarchical models. Test 7, 325–346 (1998). https://doi.org/10.1007/BF02565116
DOI: https://doi.org/10.1007/BF02565116
Key Words
- discrepancy measures
- generalized linear mixed model
- Monte Carlo tests
- sampling-based model fitting
- stagewise model adequacy