Computational Statistics, Volume 15, Issue 3, pp 391–420

Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models

  • William J. Browne
  • David Draper


We use simulation studies (a) to compare Bayesian and likelihood fitting methods, in terms of validity of conclusions, in two-level random-slopes regression (RSR) models, and (b) to compare several Bayesian estimation methods based on Markov chain Monte Carlo, in terms of computational efficiency, in random-effects logistic regression (RELR) models. We find (a) that the Bayesian approach with a particular choice of diffuse inverse Wishart prior distribution for the (co)variance parameters performs at least as well—in terms of bias of estimates and actual coverage of nominal 95% intervals—as maximum likelihood methods in RSR models with medium sample sizes (expressed in terms of the number J of level-2 units), but neither approach performs as well as might be hoped with small J; and (b) that an adaptive hybrid Metropolis-Gibbs sampling method we have developed for use in the multilevel modeling package MLwiN outperforms adaptive rejection Gibbs sampling in the RELR models we have considered, sometimes by a wide margin.
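The hybrid Metropolis-Gibbs strategy described above can be illustrated with a minimal sketch; this is not the MLwiN implementation, and all model sizes, tuning constants, and variable names below are illustrative assumptions. The sketch fits a two-level RELR model with a single fixed effect: random-walk Metropolis updates for the fixed effect and each random effect (with proposal scales crudely adapted during burn-in toward a moderate acceptance rate), combined with a conjugate Gibbs draw for the level-2 variance under a diffuse inverse-gamma prior.

```python
import math
import random

random.seed(1)

# Simulated two-level RELR data: y_ij ~ Bernoulli(logit^{-1}(beta + u_j)),
# u_j ~ N(0, sigma^2), with J level-2 units of n observations each.
J, n = 20, 10
beta_true, sigma_true = 0.5, 1.0
u_true = [random.gauss(0.0, sigma_true) for _ in range(J)]
y = [[1 if random.random() < 1.0 / (1.0 + math.exp(-(beta_true + u_true[j]))) else 0
      for _ in range(n)] for j in range(J)]
succ = [sum(row) for row in y]  # successes per level-2 unit

def loglik_group(j, eta):
    """Bernoulli log-likelihood for unit j at linear predictor eta (numerically stable)."""
    lse = max(-eta, 0.0) + math.log1p(math.exp(-abs(eta)))  # log(1 + e^{-eta})
    return succ[j] * (-lse) + (n - succ[j]) * (-eta - lse)

def metropolis(x, logpost, sd):
    """One random-walk Metropolis update; returns (new value, accepted flag)."""
    xp = random.gauss(x, sd)
    if math.log(random.random()) < logpost(xp) - logpost(x):
        return xp, 1
    return x, 0

beta, sigma2, u = 0.0, 1.0, [0.0] * J
sd_beta, sd_u = 0.5, 0.5            # proposal scales, adapted during burn-in
acc_b, acc_u, tries_u = 0, 0, 0
n_iter, burn, eps = 2000, 500, 0.001
draws = []

for t in range(n_iter):
    # Metropolis update for the fixed effect beta.
    lp_beta = lambda b: sum(loglik_group(j, b + u[j]) for j in range(J))
    beta, a = metropolis(beta, lp_beta, sd_beta)
    acc_b += a
    # Metropolis updates for each random effect u_j.
    for j in range(J):
        lp_u = lambda v, j=j: loglik_group(j, beta + v) - v * v / (2.0 * sigma2)
        u[j], a = metropolis(u[j], lp_u, sd_u)
        acc_u += a
        tries_u += 1
    # Gibbs draw for sigma^2 from its conjugate inverse-gamma full conditional
    # under a diffuse IG(eps, eps) prior.
    shape = eps + J / 2.0
    rate = eps + sum(v * v for v in u) / 2.0
    sigma2 = rate / random.gammavariate(shape, 1.0)
    # Crude adaptation: every 100 burn-in iterations, nudge the proposal
    # scales toward a moderate (~50%) acceptance rate.
    if t < burn and (t + 1) % 100 == 0:
        sd_beta *= 0.8 if acc_b < 40 else (1.2 if acc_b > 60 else 1.0)
        r = acc_u / tries_u
        sd_u *= 0.8 if r < 0.4 else (1.2 if r > 0.6 else 1.0)
        acc_b, acc_u, tries_u = 0, 0, 0
    if t >= burn:
        draws.append((beta, sigma2))

beta_hat = sum(d[0] for d in draws) / len(draws)
sigma2_hat = sum(d[1] for d in draws) / len(draws)
```

Freezing the adaptation after burn-in, as here, keeps the post-burn-in chain a valid fixed-kernel Markov chain; the windowed multiply-by-constant tuning rule is one simple choice among many.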


Keywords: Adaptive Metropolis sampling; diffuse prior distributions; educational data; Gibbs sampling; hierarchical modeling; IGLS; Markov chain Monte Carlo (MCMC); MCMC efficiency; maximum likelihood methods; random-effects logistic regression; random-slopes regression; RIGLS; variance components



The authors are grateful (a) to Harvey Goldstein and Jon Rasbash for a fruitful collaboration on MLwiN and for helpful discussions and comments, (b) to the EPSRC and ESRC for financial support, and (c) to Jim Hodges, Herbert Hoijtink, Dennis Lindley, and Steve Raudenbush for references and comments on this and/or related papers. Membership on this list does not imply agreement with the ideas expressed here, nor are any of these people responsible for any errors that may be present.



Copyright information

© Physica-Verlag 2000

Authors and Affiliations

  • William J. Browne (1)
  • David Draper (2)

  1. Institute of Education, University of London, London, England
  2. Department of Mathematical Sciences, University of Bath, Bath, England
