Behavior Research Methods, Volume 51, Issue 1, pp 243–257

Comparison of model- and design-based approaches to detect the treatment effect and covariate by treatment interactions in three-level models for multisite cluster-randomized trials

  • Burak Aydin
  • James Algina
  • Walter L. Leite


In this study, we evaluated the estimation of three important parameters for data collected in a multisite cluster-randomized trial (MS-CRT): the treatment effect and the treatment by covariate interactions at Levels 1 and 2. The Level 1 interaction parameter is the coefficient for the product of the treatment indicator and the covariate centered on its Level 2 expected value. The Level 2 interaction parameter is the coefficient for the product of the treatment indicator and the Level 2 expected value centered on its Level 3 expected value. We compared a model-based approach with design-based approaches using simulation studies. The results showed that both approaches produced similar treatment effect estimates and Level 1 interaction estimates, as well as similar Type I error rates and statistical power. However, the estimate of the Level 2 interaction coefficient for the product of the treatment indicator and an arithmetic mean of the Level 1 covariate was severely biased in most conditions. Therefore, applied researchers should be cautious when using arithmetic means to form a treatment by covariate interaction at Level 2 in MS-CRT data.
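The two product terms described above can be illustrated with a minimal sketch. It is not the authors' code: the field names (`site`, `cluster`, `treat`, `x`) and the function `build_interaction_terms` are hypothetical, and the sketch substitutes observed arithmetic means for the Level 2 and Level 3 expected values. That substitution is exactly the practice the abstract warns about for the Level 2 interaction, so the sketch only shows how the terms are formed, not a recommended estimator. The site mean is taken here as the mean of the cluster means, which is also an assumption.

```python
from collections import defaultdict
from statistics import mean


def build_interaction_terms(rows):
    """Form centered covariates and treatment-by-covariate products.

    rows: list of dicts with hypothetical keys
        site (Level 3 id), cluster (Level 2 id),
        treat (0/1, assigned at the cluster level), x (Level 1 covariate).
    Observed arithmetic means stand in for the expected values.
    """
    # Cluster (Level 2) arithmetic mean of the Level 1 covariate.
    x_by_cluster = defaultdict(list)
    for r in rows:
        x_by_cluster[(r["site"], r["cluster"])].append(r["x"])
    cluster_mean = {key: mean(xs) for key, xs in x_by_cluster.items()}

    # Site (Level 3) mean, assumed here to be the mean of the cluster means.
    means_by_site = defaultdict(list)
    for (site, _cluster), m in cluster_mean.items():
        means_by_site[site].append(m)
    site_mean = {site: mean(ms) for site, ms in means_by_site.items()}

    out = []
    for r in rows:
        cm = cluster_mean[(r["site"], r["cluster"])]
        sm = site_mean[r["site"]]
        x_l1 = r["x"] - cm   # covariate centered on its cluster mean
        x_l2 = cm - sm       # cluster mean centered on its site mean
        out.append({
            **r,
            "x_l1": x_l1,
            "x_l2": x_l2,
            "tx_l1": r["treat"] * x_l1,  # Level 1 interaction product
            "tx_l2": r["treat"] * x_l2,  # Level 2 interaction product
        })
    return out


# Toy data: one site, one treated and one control cluster.
rows = [
    {"site": "A", "cluster": 1, "treat": 1, "x": 2.0},
    {"site": "A", "cluster": 1, "treat": 1, "x": 4.0},
    {"site": "A", "cluster": 2, "treat": 0, "x": 6.0},
    {"site": "A", "cluster": 2, "treat": 0, "x": 8.0},
]
terms = build_interaction_terms(rows)
```

In this toy data the cluster means are 3.0 and 7.0 and the site mean is 5.0, so the first treated observation has a Level 1 product of -1.0 and a Level 2 product of -2.0, while both products are zero in the control cluster.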


Three-level models; Covariate by treatment interaction; Design-based; Model-based; Multisite cluster-randomized trials



Copyright information

© Psychonomic Society, Inc. 2018

Authors and Affiliations

  1. School of Education, Recep Tayyip Erdogan University, Rize, Turkey
  2. School of Human Development and Organizational Studies in Education, University of Florida, Gainesville, USA
