
Detecting stage-wise outliers in hierarchical Bayesian linear models of repeated measures data

  • Outlier Detection
  • Annals of the Institute of Statistical Mathematics

Abstract

We propose numerical and graphical methods for outlier detection in hierarchical Bayes modeling and analyses of repeated measures regression data from multiple subjects; data from a single subject are generically called a “curve”. The first stage of our model has curve-specific regression coefficients with possibly autoregressive errors of a prespecified order. The first-stage regression vectors for different curves are linked in a second-stage modeling step, possibly involving additional regression variables. Detection of the stage at which a curve appears to be an outlier, and of the magnitude and specific component of the violation at that stage, is accomplished by embedding the null model into a larger parametric model that can accommodate such unusual observations. We give two examples to illustrate the diagnostics, develop a BUGS program to compute them using MCMC techniques, and examine the sensitivity of the conclusions to the prior modeling assumptions.
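
As a rough illustration of the two-stage structure described above, the following Python sketch simulates repeated measures curves and plants a second-stage outlier. It is not the authors' BUGS program. The function name `simulate_curves`, the straight-line design, the AR(1) error order, the parameter values, and the mean-shift form of the model expansion are all assumptions made for the example.

```python
import numpy as np

def simulate_curves(n_curves=8, n_times=20, rho=0.5, sigma=0.3,
                    outlier_curve=None, shift=None, seed=0):
    """Simulate curves from a hypothetical two-stage linear model.

    Stage 1: y_i = X beta_i + e_i, with e_i a stationary AR(1) process.
    Stage 2: beta_i = mu + delta_i + b_i, where delta_i is zero for every
             curve except the designated outlier (an assumed mean-shift
             expansion of the null model).
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_times)
    X = np.column_stack([np.ones(n_times), t])   # intercept and slope
    mu = np.array([1.0, 2.0])                    # assumed population mean
    tau = np.array([0.2, 0.2])                   # assumed stage-2 std. devs.

    # Stage 2: curve-specific coefficients scattered around mu.
    betas = mu + tau * rng.standard_normal((n_curves, 2))
    if outlier_curve is not None:
        # Expanded model: shift the coefficients of one curve.
        betas[outlier_curve] += np.asarray(shift, dtype=float)

    # Stage 1: regression mean plus AR(1) errors for each curve.
    Y = np.empty((n_curves, n_times))
    for i in range(n_curves):
        e = np.empty(n_times)
        # Start from the stationary distribution of the AR(1) process.
        e[0] = rng.normal(0.0, sigma / np.sqrt(1.0 - rho**2))
        for j in range(1, n_times):
            e[j] = rho * e[j - 1] + rng.normal(0.0, sigma)
        Y[i] = X @ betas[i] + e
    return X, Y, betas
```

Calling `simulate_curves(outlier_curve=3, shift=[0.0, 1.5])` gives a data set in which curve 3 has an unusually steep slope relative to the other curves, the kind of second-stage violation that a diagnostic of this type would be expected to flag.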




Cite this article

Peruggia, M., Santner, T.J. & Ho, YY. Detecting stage-wise outliers in hierarchical Bayesian linear models of repeated measures data. Ann Inst Stat Math 56, 415–433 (2004). https://doi.org/10.1007/BF02530534


  • DOI: https://doi.org/10.1007/BF02530534
