
Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models

Computational Statistics

Summary

We use simulation studies (a) to compare Bayesian and likelihood fitting methods, in terms of validity of conclusions, in two-level random-slopes regression (RSR) models, and (b) to compare several Bayesian estimation methods based on Markov chain Monte Carlo, in terms of computational efficiency, in random-effects logistic regression (RELR) models. We find (a) that the Bayesian approach with a particular choice of diffuse inverse Wishart prior distribution for the (co)variance parameters performs at least as well—in terms of bias of estimates and actual coverage of nominal 95% intervals—as maximum likelihood methods in RSR models with medium sample sizes (expressed in terms of the number J of level-2 units), but neither approach performs as well as might be hoped with small J; and (b) that an adaptive hybrid Metropolis-Gibbs sampling method we have developed for use in the multilevel modeling package MLwiN outperforms adaptive rejection Gibbs sampling in the RELR models we have considered, sometimes by a wide margin.

Figures 1–6

Notes

  1. For instance, from expert judgment (see, e.g., Madigan et al. 1995 for a method of eliciting a “prior data set” in the context of graphical models) or previous studies judged relevant to the current inquiry.

  2. Jim Hodges (personal communication) has recently noted that there may be more potential problems with multimodality of posterior distributions in hierarchical models than is commonly believed; see Liu and Hodges (1999) for details. This may be investigated in MLwiN by making parallel runs with widely dispersed starting values, as in Gelman and Rubin (1992).

  3. For example, if the user wished to report \(\hat{\beta}_0 = 30.6 = 3.06 \cdot 10^{1}\), i.e., k = 3, then (10) would be applied with b = 1; whereas if 30 were subtracted from all data values and the user still insisted on k = 3, the estimate would now be \(6.44 \cdot 10^{-1}\), (10) would now be invoked with all the same inputs except b = −1, and the new \(\hat{n}_M\) value would be 10,000 times larger than before. In effect, in the presence of Monte Carlo uncertainty, it is just as hard to accurately announce a posterior mean of 30.644 when the posterior SD is (say) 0.371 as it is to quote a posterior mean of 0.644 with the same posterior SD.

  4. This is a potentially dangerous strategy in small-sample settings on grounds of failure to propagate model uncertainty (e.g., Draper 1995), but the corrections required to adjust for having performed model selection and fitting on the same data set with, e.g., 48 schools and 887 students (as in the JSP data) should be modest.

References

  • Breslow, N.E. and Clayton, D.G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88, 9–25.

  • Brooks, S.P. and Draper, D. (2000). Comparing the efficiency of MCMC samplers. Technical report, Department of Mathematical Sciences, University of Bath, UK.

  • Browne, W.J. (1998). Applying MCMC Methods to Multilevel Models. PhD dissertation, Department of Mathematical Sciences, University of Bath, UK.

  • Browne, W.J. and Draper, D. (1999). A comparison of Bayesian and likelihood methods for fitting multilevel models. Submitted.

  • Bryk, A.S. and Raudenbush, S.W. (1992). Hierarchical Linear Models: Applications and Data Analysis Methods. London: Sage.

  • Bryk, A.S., Raudenbush, S.W., Seltzer, M. and Congdon, R. (1988). An Introduction to HLM: Computer Program and User’s Guide (Second Edition). Chicago: University of Chicago Department of Education.

  • Carlin, B. (1992). Discussion of “Hierarchical models for combining information and for meta-analysis,” by Morris, C.N. and Normand, S.L. In Bayesian Statistics 4, Bernardo, J.M., Berger, J.O., Dawid, A.P. and Smith, A.F.M. (eds.), 336–338. Oxford: Clarendon Press.

  • Draper, D. (1995). Assessment and propagation of model uncertainty (with discussion). Journal of the Royal Statistical Society, Series B, 57, 45–97.

  • Draper, D. (2000). Bayesian Hierarchical Modeling. New York: Springer-Verlag, forthcoming.

  • Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (1995). Bayesian Data Analysis. London: Chapman & Hall.

  • Gelman, A., Roberts, G.O. and Gilks, W.R. (1995). Efficient Metropolis jumping rules. In Bayesian Statistics 5, Bernardo, J.M., Berger, J.O., Dawid, A.P. and Smith, A.F.M. (eds.), 599–607. Oxford: Clarendon Press.

  • Gelman, A. and Rubin, D.B. (1992). Inference from iterative simulation using multiple sequences (with discussion). Statistical Science, 7, 457–511.

  • Gilks, W.R., Richardson, S. and Spiegelhalter, D.J. (1996). Markov Chain Monte Carlo in Practice. London: Chapman & Hall.

  • Gilks, W.R. and Wild, P. (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics, 41, 337–348.

  • Goldstein, H. (1986). Multilevel mixed linear model analysis using iterative generalised least squares. Biometrika, 73, 43–56.

  • Goldstein, H. (1989). Restricted unbiased iterative generalised least squares estimation. Biometrika, 76, 622–623.

  • Goldstein, H. (1995). Multilevel Statistical Models, Second Edition. London: Edward Arnold.

  • Heath, A., Yang, M. and Goldstein, H. (1996). Multilevel analysis of the changing relationship between class and party in Britain, 1964–1992. Quality and Quantity, 30, 389–404.

  • Liu, J. and Hodges, J.S. (1999). Characterizing modes of the likelihood, restricted likelihood, and posterior for hierarchical models. Technical Report 99–011, Division of Biostatistics, University of Minnesota.

  • Longford, N.T. (1987). A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects. Biometrika, 74, 817–827.

  • Madigan, D., Gavrin, J. and Raftery, A.E. (1995). Eliciting prior information to enhance the predictive performance of Bayesian graphical models. Communications in Statistics, Theory and Methods, 24, 2271–2292.

  • Mortimore, P., Sammons, P., Stoll, L., Lewis, D. and Ecob, R. (1988). School Matters. Wells: Open Books.

  • Müller, P. (1993). A generic approach to posterior integration and Gibbs sampling. Technical Report, ISDS, Duke University, Durham NC.

  • Pinheiro, J.C. and Bates, D.M. (1995). Approximations to the log-likelihood function in the non-linear mixed-effects model. Journal of Computational and Graphical Statistics, 4, 12–35.

  • Raftery, A.E. and Lewis, S. (1992). How many iterations in the Gibbs sampler? In Bayesian Statistics 4, Bernardo, J.M., Berger, J.O., Dawid, A.P. and Smith, A.F.M. (eds.), 763–774. Oxford: Clarendon Press.

  • Rasbash, J., Browne, W.J., Goldstein, H., Yang, M., Plewis, I., Draper, D., Healy, M. and Woodhouse, G. (1999). A User’s Guide to MLwiN, Version 2.0, London: Institute of Education, University of London.

  • Raudenbush, S.W., Yang, M.-L. and Yosef, M. (2000). Maximum likelihood for hierarchical models via high-order multivariate Laplace approximations. Journal of Computational and Graphical Statistics, forthcoming.

  • Ripley, B.D. (1987). Stochastic Simulation. New York: Wiley.

  • Spiegelhalter, D.J., Thomas, A., Best, N.G. and Gilks, W.R. (1997). BUGS: Bayesian Inference Using Gibbs Sampling, Version 0.60. Cambridge: Medical Research Council Biostatistics Unit.

  • Woodhouse, G., Rasbash, J., Goldstein, H., Yang, M., Howarth, J. and Plewis, I. (1995). A Guide to MLn for New Users. London: Institute of Education, University of London.

  • Zeger, S.L. and Karim, M.R. (1991). Generalized linear models with random effects: a Gibbs sampling approach. Journal of the American Statistical Association, 86, 79–86.

Acknowledgments

The authors, who may be contacted by email at bwjsmsr@ioe.ac.uk and dd@maths.bath.ac.uk, respectively, are grateful (a) to Harvey Goldstein and Jon Rasbash for a fruitful collaboration on MLwiN and for helpful discussions and comments, (b) to the EPSRC and ESRC for financial support, and (c) to Jim Hodges, Herbert Hoijtink, Dennis Lindley, and Steve Raudenbush for references and comments on this and/or related papers. Membership on this list does not imply agreement with the ideas expressed here, nor are any of these people responsible for any errors that may be present.

Appendix on MLwiN

In a journal on computational statistics it may be of interest to briefly describe some implementation details of MLwiN. This multilevel modeling package, which at this writing has a worldwide user base in excess of 1,500, is a Windows version of MLn (Woodhouse et al. 1995) with many new features in addition to the port from DOS to Windows. The user interface (front end) is written in Visual Basic, and the programming engine (back end) that performs the modeling is a slightly modified version of the old MLn package written in C++. The MCMC options have their own estimation engine, which was originally a free-standing program written in C. This program has now been incorporated into the MLwiN package via interfacing code (in C++) that sends the MCMC routines the data for the current model, including starting values, and sends estimates back to the main program.

The Visual Basic routines allow the user to monitor, in real time, the progress across iterations of the maximum-likelihood and MCMC fitting methods in two ways: via an equations window, which refreshes the current numerical parameter estimates every R iterations, and a trajectories window, which graphs the estimates against iteration number. Both options come at a significant MCMC run-time price, because screen refreshes are slow relative to the MCMC calculations themselves. With a refresh rate of R = 50, for instance, the MCMC engine passes results back to the front end every 50 iterations. To improve the speed of the iterations while still displaying trajectory plots in real time, many of the variables used in the MCMC engine are stored globally, so that they do not have to be recalculated each time the MCMC engine is called.

To get the fastest MCMC estimation speed out of MLwiN, it is best not to show any of the windows, particularly the trajectories plots, and to increase the refresh rate, although doing so does not allow the user to monitor progress. Some idea of the tradeoffs involved may be gained from the following timings: on a 333MHz Pentium with 128Mb RAM, fitting model (11) to the JSP data, a monitoring run of 5,000 iterations after a burn-in of 500 takes 33 seconds in real time with no windows displayed, 46 seconds with the equations window open, 65 seconds with the trajectories window running, and 76 seconds with both.

Browne, W.J., Draper, D. Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models. Computational Statistics 15, 391–420 (2000). https://doi.org/10.1007/s001800000041
