Summary
We use simulation studies (a) to compare Bayesian and likelihood fitting methods, in terms of validity of conclusions, in two-level random-slopes regression (RSR) models, and (b) to compare several Bayesian estimation methods based on Markov chain Monte Carlo (MCMC), in terms of computational efficiency, in random-effects logistic regression (RELR) models. We find (a) that the Bayesian approach with a particular choice of diffuse inverse Wishart prior distribution for the (co)variance parameters performs at least as well—in terms of bias of estimates and actual coverage of nominal 95% intervals—as maximum likelihood methods in RSR models with medium sample sizes (expressed in terms of the number J of level-2 units), but that neither approach performs as well as might be hoped with small J; and (b) that an adaptive hybrid Metropolis-Gibbs sampling method we have developed for use in the multilevel modeling package MLwiN outperforms adaptive rejection Gibbs sampling in the RELR models we have considered, sometimes by a wide margin.
Notes
*For instance, from expert judgment (see, e.g., Madigan et al. 1995 for a method of eliciting a “prior data set” in the context of graphical models) or previous studies judged relevant to the current inquiry.
†Jim Hodges (personal communication) has recently noted that there may be more potential problems with multimodality of posterior distributions in hierarchical models than is commonly believed; see Liu and Hodges (1999) for details. This may be investigated in MLwiN by making parallel runs with widely dispersed starting values, as in Gelman and Rubin (1992).
‡For example, if the user wished to report \(\hat{\beta}_0 = 30.6 = 3.06 \cdot 10^{1}\), i.e., k = 3, (10) would be applied with b = 1; whereas if 30 were subtracted from all data values and the user still insisted on k = 3, the estimate would now be \(6.44 \cdot 10^{-1}\), (10) would now be invoked with all the same inputs except b = −1, and the new \(\hat{n}_M\) value would be 10,000 times larger than before. In effect, in the presence of Monte Carlo uncertainty, it is just as hard to accurately announce a posterior mean of 30.644 when the posterior SD is (say) 0.371 as it is to quote a posterior mean of 0.644 with the same posterior SD.
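The 10,000-fold scaling in this footnote can be checked with a short sketch. The function below is our illustration only, not formula (10) itself; it assumes that the Monte Carlo sample size needed to pin down a posterior mean to k significant figures grows as the square of (posterior SD / reporting tolerance), with the tolerance taken as half the value of the last reported significant digit.

```python
def required_mcmc_n(post_sd, k, b, z=1.96):
    # A value written as m * 10**b with k significant figures is
    # resolved to within half the last significant digit:
    tol = 0.5 * 10 ** (b - k + 1)
    # Monte Carlo SE of a posterior mean from n (roughly independent)
    # draws is post_sd / sqrt(n); require z * SE <= tol.
    return (z * post_sd / tol) ** 2

# Reporting 30.6 (k = 3, b = 1) versus, after centering, 0.644 (k = 3, b = -1):
n_before = required_mcmc_n(0.371, k=3, b=1)
n_after = required_mcmc_n(0.371, k=3, b=-1)
# The ratio is (10**2)**2 = 10,000, matching the footnote.
```

The ratio depends only on the change in exponent b, which is why subtracting 30 from the data inflates the required run length so dramatically.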
§This is a potentially dangerous strategy in small-sample settings on grounds of failure to propagate model uncertainty (e.g., Draper 1995), but the corrections required to adjust for having performed model selection and fitting on the same data set with, e.g., 48 schools and 887 students (as in the JSP data) should be modest.
References
Breslow, N.E. and Clayton, D.G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88, 9–25.
Brooks, S.P. and Draper, D. (2000). Comparing the efficiency of MCMC samplers. Technical report, Department of Mathematical Sciences, University of Bath, UK.
Browne, W.J. (1998). Applying MCMC Methods to Multilevel Models. PhD dissertation, Department of Mathematical Sciences, University of Bath, UK.
Browne, W.J. and Draper, D. (1999). A comparison of Bayesian and likelihood methods for fitting multilevel models. Submitted.
Bryk, A.S. and Raudenbush, S.W. (1992). Hierarchical Linear Models: Applications and Data Analysis Methods. London: Sage.
Bryk, A.S., Raudenbush, S.W., Seltzer, M. and Congdon, R. (1988). An Introduction to HLM: Computer Program and User’s Guide (Second Edition). Chicago: University of Chicago Department of Education.
Carlin, B. (1992). Discussion of “Hierarchical models for combining information and for meta-analysis,” by Morris, C.N. and Normand, S.L. In Bayesian Statistics 4, Bernardo, J.M., Berger, J.O., Dawid, A.P. and Smith, A.F.M. (eds.), 336–338. Oxford: Clarendon Press.
Draper, D. (1995). Assessment and propagation of model uncertainty (with discussion). Journal of the Royal Statistical Society, Series B, 57, 45–97.
Draper, D. (2000). Bayesian Hierarchical Modeling. New York: Springer-Verlag, forthcoming.
Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B. (1995). Bayesian Data Analysis. London: Chapman & Hall.
Gelman, A., Roberts, G.O. and Gilks, W.R. (1995). Efficient Metropolis jumping rules. In Bayesian Statistics 5, Bernardo, J.M., Berger, J.O., Dawid, A.P. and Smith, A.F.M. (eds.), 599–607. Oxford: Clarendon Press.
Gelman, A. and Rubin, D.B. (1992). Inference from iterative simulation using multiple sequences (with discussion). Statistical Science, 7, 457–511.
Gilks, W.R., Richardson, S. and Spiegelhalter, D.J. (1996). Markov Chain Monte Carlo in Practice. London: Chapman & Hall.
Gilks, W.R. and Wild, P. (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics, 41, 337–348.
Goldstein, H. (1986). Multilevel mixed linear model analysis using iterative generalised least squares. Biometrika, 73, 43–56.
Goldstein, H. (1989). Restricted unbiased iterative generalised least squares estimation. Biometrika, 76, 622–623.
Goldstein, H. (1995). Multilevel Statistical Models, Second Edition. London: Edward Arnold.
Heath, A., Yang, M. and Goldstein, H. (1996). Multilevel analysis of the changing relationship between class and party in Britain, 1964–1992. Quality and Quantity, 30, 389–404.
Liu, J. and Hodges, J.S. (1999). Characterizing modes of the likelihood, restricted likelihood, and posterior for hierarchical models. Technical Report 99–011, Division of Biostatistics, University of Minnesota.
Longford, N.T. (1987). A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects. Biometrika, 74, 817–827.
Madigan, D., Gavrin, J. and Raftery, A.E. (1995). Eliciting prior information to enhance the predictive performance of Bayesian graphical models. Communications in Statistics, Theory and Methods, 24, 2271–2292.
Mortimore, P., Sammons, P., Stoll, L., Lewis, D. and Ecob, R. (1988). School Matters. Wells: Open Books.
Müller, P. (1993). A generic approach to posterior integration and Gibbs sampling. Technical Report, ISDS, Duke University, Durham NC.
Pinheiro, J.C. and Bates, D.M. (1995). Approximations to the log-likelihood function in the non-linear mixed-effects model. Journal of Computational and Graphical Statistics, 4, 12–35.
Raftery, A.E. and Lewis, S. (1992). How many iterations in the Gibbs sampler? In Bayesian Statistics 4, Bernardo, J.M., Berger, J.O., Dawid, A.P. and Smith, A.F.M. (eds.), 763–774. Oxford: Clarendon Press.
Rasbash, J., Browne, W.J., Goldstein, H., Yang, M., Plewis, I., Draper, D., Healy, M. and Woodhouse, G. (1999). A User’s Guide to MLwiN, Version 2.0, London: Institute of Education, University of London.
Raudenbush, S.W., Yang, M.-L. and Yosef, M. (2000). Maximum likelihood for hierarchical models via high-order multivariate Laplace approximations. Journal of Computational and Graphical Statistics, forthcoming.
Ripley, B.D. (1987). Stochastic Simulation. New York: Wiley.
Spiegelhalter, D.J., Thomas, A., Best, N.G. and Gilks, W.R. (1997). BUGS: Bayesian Inference Using Gibbs Sampling, Version 0.60. Cambridge: Medical Research Council Biostatistics Unit.
Woodhouse, G., Rasbash, J., Goldstein, H., Yang, M., Howarth, J. and Plewis, I. (1995). A Guide to MLn for New Users. London: Institute of Education, University of London.
Zeger, S.L. and Karim, M.R. (1991). Generalized linear models with random effects: a Gibbs sampling approach. Journal of the American Statistical Association, 86, 79–86.
Acknowledgments
The authors, who may be contacted by email at bwjsmsr@ioe.ac.uk and dd@maths.bath.ac.uk, respectively, are grateful (a) to Harvey Goldstein and Jon Rasbash for a fruitful collaboration on MLwiN and for helpful discussions and comments, (b) to the EPSRC and ESRC for financial support, and (c) to Jim Hodges, Herbert Hoijtink, Dennis Lindley, and Steve Raudenbush for references and comments on this and/or related papers. Membership on this list does not imply agreement with the ideas expressed here, nor are any of these people responsible for any errors that may be present.
Appendix on MLwiN
In a journal on computational statistics it may be of interest to briefly describe some implementation details of MLwiN. This multilevel modeling package, which at this writing has a worldwide user base of more than 1,500, is a Windows version of MLn (Woodhouse et al. 1995) with many new features in addition to the port from DOS to Windows. The user interface (front end) is written in Visual Basic, and the programming engine (back end) that performs the modeling is a slightly modified version of the old MLn package written in C++. The MCMC options have their own estimation engine, originally a free-standing program written in C. This program has now been incorporated into the MLwiN package via interfacing code (in C++) that passes the MCMC routines the data and starting values for the current model and returns estimates to the main program.
The Visual Basic routines allow the user to monitor, in real time, the progress across iterations of the maximum-likelihood and MCMC fitting methods in two ways: via an equations window, which refreshes the current numerical parameter estimates every R iterations, and a trajectories window, which graphs the estimates against iteration number. Both options come at a significant MCMC run-time price, because screen refreshes are slow relative to the MCMC calculations themselves. With a refresh rate of R = 50, for example, the MCMC engine passes results back to the front end every 50 iterations. To improve the speed of the iterations while still displaying trajectory plots in real time, many of the variables used in the MCMC engine are stored globally, so that they need not be recalculated each time the engine is called.
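The buffering pattern described here can be sketched in a few lines. The code below is an illustrative Python stand-in, not MLwiN's actual C++ engine: the `update` step is a toy random-walk Metropolis sampler on a standard normal target, and `callback` plays the role of the (slow) front-end redraw invoked only every `refresh` iterations.

```python
import math
import random

def update(state):
    # One random-walk Metropolis step on a standard normal target
    # (a placeholder for whatever model the engine is fitting).
    prop = state + random.gauss(0, 1)
    if math.log(random.random()) < 0.5 * (state**2 - prop**2):
        return prop
    return state

def run_mcmc(n_iter, refresh=50, callback=None):
    # Draws accumulate in a buffer; the expensive display callback
    # (equations/trajectories windows) fires only every `refresh`
    # iterations, keeping the sampling loop itself fast.
    draws, buffer, state = [], [], 0.0
    for it in range(1, n_iter + 1):
        state = update(state)
        buffer.append(state)
        if it % refresh == 0:
            draws.extend(buffer)
            if callback:
                callback(it, buffer)  # e.g. redraw the trajectories window
            buffer = []
    draws.extend(buffer)
    return draws
```

Raising `refresh` (or passing `callback=None`) mimics the speed-ups reported in the timings below: the sampler does the same work, but the display layer is consulted less often.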
To get the fastest MCMC estimation speed out of MLwiN, it is best not to display any of the windows (particularly the trajectories plots) and to increase the refresh rate, although doing so prevents the user from monitoring progress. Some idea of the tradeoffs involved may be gained from the following timings: on a 333 MHz Pentium with 128 MB of RAM, fitting model (11) to the JSP data, a monitoring run of 5,000 iterations after a burn-in of 500 takes 33 seconds in real time with no windows displayed, 46 seconds with the equations window open, 65 seconds with the trajectories window running, and 76 seconds with both.
Browne, W.J., Draper, D. Implementation and performance issues in the Bayesian and likelihood fitting of multilevel models. Computational Statistics 15, 391–420 (2000). https://doi.org/10.1007/s001800000041